IMPROVED EXACT METHODS FOR STATISTICAL INFERENCE
IN  CONTINGENCY  TABLES 


By 

DONGUK  KIM 


A DISSERTATION  PRESENTED  TO  THE  GRADUATE  SCHOOL 
OF  THE  UNIVERSITY  OF  FLORIDA  IN  PARTIAL  FULFILLMENT 
OF  THE  REQUIREMENTS  FOR  THE  DEGREE  OF 
DOCTOR  OF  PHILOSOPHY 

UNIVERSITY  OF  FLORIDA 

1994 




© Copyright  1994 
by 

Donguk  Kim 


To  my  wife,  daughter 
and 

my  parents 


ACKNOWLEDGEMENTS 


I would  like  to  express  my  sincere  gratitude  to  Dr.  Alan  Agresti.  Without  his 
guidance  and  encouragement,  this  work  would  not  have  been  completed.  I would  like 
to  thank  Dr.  Mark  Yang,  Dr.  Myron  Chang,  Dr.  Brett  Presnell,  and  Dr.  David  Wilson 
for  their  encouragement  and  advice  while  serving  on  my  dissertation  committee. 

In  my  six  years  as  a student  here,  I learned  from  all  professors.  I would  also  like 
to thank Dr. Yang and Dr. Randles for all the support while I worked as a consultant
in  the  Biostatistics  Division  and  as  a teaching  assistant.  Also  my  thanks  go  to  all  my 
colleagues  and  friends. 

Finally, I wish to express my special thanks to my family, especially my wife,
YoungHee,  for  her  love,  patience,  and  encouragement,  and  my  daughter,  Minjee  for 
her  love.  Furthermore,  I would  like  to  thank  my  parents  for  their  love,  encouragement, 
and  support. 




TABLE  OF  CONTENTS 


ACKNOWLEDGEMENTS  iv 

ABSTRACT  vii 

CHAPTERS 

1 INTRODUCTION  1 

1.1  Literature  Review  1 

1.2  Summary  of  Dissertation  Work  6 

2 IMPROVED EXACT INFERENCE ABOUT CONDITIONAL ASSOCIATION  9

2.1  Introduction  9 

2.2  A Less  Conservative  P-value  11 

2.3  A Less  Conservative  “Exact”  Confidence  Interval  31 

2.4  Alternative  Modifications  of  “Exact”  Confidence  Intervals  38 

2.5  Connections  with  Logistic  Regression  62 

2.6  Discussion  63 

3 APPROXIMATING EXACT INFERENCE ABOUT CONDITIONAL ASSOCIATION  64

3.1  Introduction  64 

3.2  Tests of Conditional Independence Assuming No Three-factor Interaction  65

3.3  Tests of Conditional Independence Permitting Three-factor Interaction  72

3.4  The  Construction  of  the  Modified  Exact  P-value  82 

3.5  Approximation  of  Exact  P-values  86 

3.6  Examples  89 

3.7  FORTRAN  Program  for  Simulation  94 

4 IMPROVED EXACT TESTS FOR ORDINAL VARIABLES IN I x J x K TABLES  96




4.1  Introduction  96 

4.2  Basic  Results  in  Two-way  Contingency  Table  98 

4.3  Unbiasedness  of  Tests  in  Three-way  Contingency  Tables  104 

4.4  Complete  Class  of  Tests  115 

4.5  Admissible  Tests  116 

4.6  Exact,  Unbiased  and  Admissible  Tests  118 

4.7  Example  121 

4.8  Discussion  124 

5 CONCLUSION  125 

5.1  Discussion  125 

5.2  Future  Research  126 

APPENDICES 

A SOURCE  CODE  FOR  EXACT  INFERENCE  129 

B SOURCE  CODE  FOR  SIMULATION  209 

B.1  Program Structure  209

B.2  Part  of  Source  Code  211 

REFERENCES  248 

BIOGRAPHICAL  SKETCH  252 




Abstract  of  Dissertation  Presented  to  the  Graduate  School 
of  the  University  of  Florida  in  Partial  Fulfillment 
of  the  Requirements  for  the  Degree  of 
Doctor  of  Philosophy 

IMPROVED  EXACT  METHODS  FOR  STATISTICAL  INFERENCE 
IN  CONTINGENCY  TABLES 

By 

Donguk  Kim 
August  1994 

Chairman:  Alan  Agresti 
Major  Department:  Statistics 

Ordinary  “exact”  methods  can  be  highly  conservative  when  the  distribution  of  the 
test statistic is discrete. The conservativeness is more severe when the number of dimensions or
the number of categories is small. We improve exact inferential methods by decreasing the
conservativeness  that  occurs  due  to  discreteness.  In  this  dissertation,  modifications 
of  exact  inferential  methods  are  suggested  for  conditional  associations  in  three-way 
contingency tables. For testing conditional independence, we present a modified
P-value. It utilizes both the usual test statistic and, at the observed value of that
statistic, a supplementary statistic directed toward a broader alternative. For 2 x 2 x K
tables,  we  propose  modified  “exact”  confidence  intervals  for  an  assumed  common  odds 
ratio  based  on  inverting  two  separate  one-sided  tests  using  the  modified  P-value. 
We  also  present  an  alternative  and  usually  even  better  way  of  constructing  “exact” 
confidence  intervals,  based  on  inverting  a two-sided  test  with  a modified  P-value. 

For I x J x K tables, we discuss exact tests of conditional independence using six
test  statistics  that  have  connections  with  loglinear  models.  Three  statistics  assume 
a lack  of  three-factor  interaction,  and  the  other  three  statistics  do  not  require  this 
assumption. All six statistics are score statistics for loglinear models that treat none,
one,  or  both  of  the  classifications  as  ordinal.  Then,  we  discuss  possible  alternative 
ways  of  forming  modified  exact  P-values  in  I x J x K contingency  tables,  and  we 
propose  modified  exact  P-values  for  six  tests  corresponding  to  six  loglinear  models. 
For  three-way  contingency  tables,  computational  algorithms  have  limited  availability 
for  tests  of  conditional  independence  when  I and  J exceed  two.  We  use  a simulation 
algorithm  to  obtain  precise  estimates  of  ordinary  and  modified  exact  P-values  for 
cases  for  which  the  current  computational  algorithms  are  infeasible. 

For I x J x K tables, we show how to construct exact, unbiased, and admissible
tests  for  an  ordinal  alternative  to  conditional  independence  by  using  a modified  P- 
value  approach.  This  is  a generalization  of  the  results  of  Cohen  and  Sackrowitz  for 
a test  of  independence  in  two-way  contingency  tables  for  an  ordinal  alternative.  The 
ordinary  test  of  conditional  independence  for  2 x 2 x K contingency  tables  is  usually 
inadmissible. 




CHAPTER  1 
INTRODUCTION 


1.1  Literature  Review 


Statistical  inference  for  contingency  tables  generally  is  carried  out  by  large-sample 
approximations  for  sampling  distributions  of  the  test  statistic  rather  than  the  exact 
discrete distribution. A central concern is the quality of the asymptotic approximation.
Large-sample approximations apply as the sample size grows, for a fixed
number  of  cells.  The  adequacy  of  the  chi-square  approximation  depends  on  both  the 
sample  size  and  the  number  of  cells.  Some  contingency  tables  occur  where  the  sample 
size  is  too  small  to  apply  asymptotic  methods.  Also,  high-dimensional  contingency 
tables  tend  to  be  sparse,  and  as  a consequence  the  asymptotic  approximation  to  the 
sampling  distribution  is  often  very  poor.  Agresti  (1992)  surveyed  exact  inference  for 
contingency  tables  and  explained  the  developments  of  exact  methods  for  contingency 
tables.  He  suggested  the  use  of  exact  methods  instead  of  large-sample  approximations 
when  the  application  of  asymptotic  approximation  is  questionable.  We  focus  on  exact 
inferential  methods  for  conditional  associations  in  three-way  contingency  tables. 

When  the  exact  distribution  of  the  test  statistic  is  discrete,  it  is  known  that 
ordinary  “exact”  tests  and  confidence  intervals  can  be  highly  conservative  because  of 
the  discreteness  of  the  distribution.  Though  exact  tests  are  guaranteed  to  control  the 
probability  of  Type  I error  at  any  nominal  level,  we  may  not  achieve  a probability 
of  Type  I error  of  the  nominal  level  exactly.  The  actual  probability  of  Type  I error 
may  be  considerably  smaller.  For  instance,  in  a 2 x 2 contingency  table,  Fisher’s 
exact test is always conservative. For exact inference about a parameter of interest,
we  condition  on  sufficient  statistics  for  unknown  parameters  to  eliminate  them.  For 
an  exact  conditional  test  for  categorical  data,  the  reference  set  of  tables  over  which 
the  exact  conditional  distribution  is  defined  is  the  set  of  contingency  tables  having 
certain  marginal  counts  fixed.  This  extra  conditioning  makes  the  distribution  of  the 
test  statistic  more  highly  discrete. 

Barnard  (1947)  proposed  an  unconditional  exact  test  for  2x2  contingency  tables. 
The  reference  set  of  his  test  is  defined  as  the  set  of  all  tables  with  fixed  row  margins 
and all possible column margins. Since the column margins are not fixed, this unconditional
test has many more tables in the reference set, and the distribution of the
test statistic is less discrete. A disadvantage of the unconditional test is that computations
are infeasible for larger tables, since maximizing over the space of nuisance
parameters  is  needed  for  implementation.  For  further  details,  see  Yates  (1984)  and 
Suissa  and  Shuster  (1985). 

One way to reduce conservativeness is the mid P adjustment. Let T be a test
statistic and $t_o$ be its observed value. According to Lancaster (1961), the mid P
adjustment counts only half of the probability of the observed value of T; that is, it
subtracts half of the probability of the observed statistic from the usual exact P-value.
This reduces the conservativeness due to discreteness without relying on randomization.
One drawback is that it cannot guarantee exactness, in the sense that the actual size
may exceed the nominal level, precisely because half of the probability of the
observed statistic has been subtracted from the exact P-value.
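
As a small numerical illustration of the mid P adjustment, the following Python sketch computes the ordinary one-sided exact P-value and the mid P-value for a single 2 x 2 table with fixed margins. The function names and the example counts are illustrative assumptions and are not taken from the dissertation.

```python
from math import comb


def hypergeom_pmf(x, n1, n2, m):
    """Null probability that the (1,1) cell equals x, given row totals
    n1, n2 and first-column total m (hypergeometric distribution)."""
    return comb(n1, x) * comb(n2, m - x) / comb(n1 + n2, m)


def ordinary_and_mid_p(x_obs, n1, n2, m):
    """One-sided (upper-tail) ordinary exact P-value and mid P-value."""
    hi = min(n1, m)  # largest possible value of the (1,1) cell
    ordinary = sum(hypergeom_pmf(x, n1, n2, m) for x in range(x_obs, hi + 1))
    # Mid P: subtract half of the probability of the observed value.
    mid = ordinary - 0.5 * hypergeom_pmf(x_obs, n1, n2, m)
    return ordinary, mid


# Hypothetical table [[3, 1], [1, 3]]: row totals 4 and 4, first-column total 4.
p, mid_p = ordinary_and_mid_p(3, 4, 4, 4)
print(p, mid_p)  # 17/70 and 9/70
```

Because the mid P-value is never larger than the ordinary P-value, a test that rejects when the mid P-value is at most the nominal level rejects at least as often, which is why its actual size can exceed that level.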

For nonparametric tests, Streitberg and Roehmel (1990) considered utilizing a
secondary statistic together with the usual statistic to discriminate among those rank
configurations that have the same value of the primary statistic. They showed that
their test is uniformly more powerful than the Wilcoxon-Mann-Whitney test, and that
the P-value of this test employing any secondary statistic cannot be larger than the
P-value from the ordinary test. A similar approach to reduce the conservativeness is
due to Cohen and Sackrowitz (1992). They suggested a modified P-value that utilizes
both the usual test statistic and, at the observed value of that statistic, the null table
probability for a secondary partitioning of those tables having $T = t_o$. Instead of
including all tables having $T = t_o$ in the calculation of the P-value, they include only
those tables that are no more likely than the observed table. They used this for ordinal
tests in two-way tables.

Discreteness also affects interval estimation. An “exact” confidence interval for a
parameter  can  be  constructed  by  inverting  the  exact  conditional  test.  The  ordinary 
confidence  interval  (Cox  1970,  Gart  1970,  Mehta  et  al.  1985,  Vollset  et  al.  1991)  is 
based  on  inverting  two  separate  one-sided  tests  using  the  ordinary  P-value.  Because 
of  discreteness,  we  get  a conservative  confidence  interval.  The  actual  confidence 
coefficient  is  at  least  the  nominal  level. 

We  could  construct  an  exact  confidence  interval  based  on  inverting  a single  two- 
sided  test  rather  than  two  separate  one-sided  tests.  Using  a two-sided  approach, 
Sterne  (1954)  constructed  a confidence  interval  for  a single  binomial  parameter,  and 
Baptista  and  Pike  (1977)  constructed  confidence  limits  for  the  odds  ratio  in  a 2 x 2 
table.  This  two-sided  confidence  interval  also  is  conservative. 

Some  problems  arise  when  exact  methods  are  infeasible  and  the  application  of 
large-sample approximations is questionable. For large-sample inference about conditional
association in three-way contingency tables, Mantel and Haenszel (1959) gave
a test  statistic  comparing  two  groups  on  a binary  response,  adjusting  for  control 
variables.  Since  Cochran  (1954)  proposed  a similar  statistic,  it  is  called  the  Cochran- 
Mantel-Haenszel  statistic.  This  is  a test  for  conditional  independence  in  2 x 2 x K 
tables.  Also,  Birch  (1964)  showed  that  under  the  assumption  of  a constant  odds  ratio 
within  each  of  the  tables,  this  test  is  uniformly  most  powerful  unbiased. 

Birch (1965) derived three test statistics for testing the null hypothesis of conditional
independence of two variables in I x J x K contingency tables. These are score
statistics for loglinear models that treat none, one, or both of the classifications as ordinal.
These  models  assume  a lack  of  three-factor  interaction.  When  both  classifications  are 
nominal,  the  corresponding  statistic  is  a generalized  Cochran-Mantel-Haenszel  test 
statistic  to  handle  more  than  two  groups  or  more  than  two  responses.  This  method 
involves computing the expected values and the covariance matrix under the multiple
hypergeometric probability model for each of the tables. These quantities then
are summed across the tables, and a quadratic form of the test statistic is generated.
When both classifications are ordinal, the corresponding statistic is the same
as  Mantel’s  (1963)  score  statistic.  Furthermore,  Birch’s  statistics  are  special  cases 
of  a general  statistic  proposed  by  Landis  et  al.  (1978).  These  statistics  have  an 
asymptotic  chi-squared  distribution. 

Rather  than  use  large-sample  approximations,  we  wish  to  conduct  exact  inference. 
Even  though  recent  developments  make  exact  methods  feasible  for  some  inferential 
analyses,  because  of  computational  complexity,  we  do  not  have  exact  methods  for 
some  situations.  For  three-way  contingency  tables,  current  computational  algorithms 
for  exact  methods  are  restricted  to  certain  analyses  for  2 x J x K tables  with  ordered 
columns. 

The Monte Carlo method is another alternative to either exact or asymptotic
methods. This method is based on estimating the exact conditional sampling distribution
of the statistic by generating random tables having the relevant fixed margins.
It is useful for those situations where the data set is too large for an exact computation
or too sparse to rely on the asymptotic theory. For table generation by simulating
from  a hypergeometric  distribution,  Boyett  (1979)  wrote  a program  that  generates  a 
two-way  random  table  from  the  exact  distribution  with  given  row  and  column  totals. 
Patefield (1981) presented a program generating a random table, and his program is
faster  than  Boyett’s  for  larger  sample  sizes. 

Agresti  et  al.  (1979)  utilized  the  Monte  Carlo  method  effectively  for  a variety  of 
tests  for  two-way  tables.  Even  for  large  tables  or  large  sample  sizes,  one  can  quickly 
approximate  as  closely  as  needed  the  ordinary  and  modified  exact  P-values  for  these 
statistics.  This  method  consists  of  sampling  contingency  tables  from  the  conditional 
reference  set  in  proportion  to  their  probabilities  and  computing  an  unbiased  point 
estimate  and  a narrow  confidence  interval  for  an  exact  P-value. 
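
The Monte Carlo idea can be sketched in Python for a single two-way table. This is a minimal sketch under stated assumptions: tables with the observed margins are drawn by randomly permuting column labels (a simple scheme that is not claimed to reproduce the cited programs), the test statistic is taken to be the Pearson statistic, and the interval is a normal-approximation 95% interval for the estimated P-value. All function names are hypothetical.

```python
import random
from math import sqrt


def pearson_x2(table):
    """Pearson chi-squared statistic for independence in a two-way table."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    x2 = 0.0
    for i, r in enumerate(table):
        for j, obs in enumerate(r):
            e = row[i] * col[j] / n
            if e > 0:
                x2 += (obs - e) ** 2 / e
    return x2


def random_table(row_totals, col_totals, rng):
    """Draw one table with the given margins by permuting column labels."""
    cols = [j for j, c in enumerate(col_totals) for _ in range(c)]
    rng.shuffle(cols)
    table = [[0] * len(col_totals) for _ in row_totals]
    pos = 0
    for i, r in enumerate(row_totals):
        for j in cols[pos:pos + r]:
            table[i][j] += 1
        pos += r
    return table


def monte_carlo_p(observed, n_sim=10000, seed=1):
    """Estimate the exact conditional P-value and a 95% interval for it."""
    rng = random.Random(seed)
    row = [sum(r) for r in observed]
    col = [sum(c) for c in zip(*observed)]
    t_obs = pearson_x2(observed)
    # Small tolerance guards against floating-point ties with t_obs.
    hits = sum(pearson_x2(random_table(row, col, rng)) >= t_obs - 1e-9
               for _ in range(n_sim))
    p_hat = hits / n_sim
    half = 1.96 * sqrt(p_hat * (1 - p_hat) / n_sim)
    return p_hat, (max(0.0, p_hat - half), min(1.0, p_hat + half))


print(monte_carlo_p([[3, 1], [1, 3]]))
```

The width of the interval shrinks at the rate of one over the square root of the number of simulated tables, so the estimate can be made as precise as needed by increasing `n_sim`.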

When  we  construct  a critical  region  for  exact  tests  with  some  preassigned  nominal 
level $\alpha$, supplementary randomization would be required at the boundary of the
critical  region  in  order  to  achieve  the  nominal  size.  This  is  typical  for  any  discrete 
problem.  After  randomization,  the  resulting  test  may  be  inadmissible.  Cohen  and 
Sackrowitz  (1991)  focused  on  two-way  tables  and  showed  unbiasedness  for  the  test  of 
independence  in  two-way  tables  for  an  ordinal  alternative.  Eaton  (1970)  showed  the 
essentially  complete  class  in  an  exponential  family.  Eaton’s  theorem  shows  that  the 
essentially  complete  class  consists  of  tests  whose  acceptance  regions  are  convex  with 
possible  randomization  on  the  boundary  of  acceptance  region.  Furthermore,  Ledwina 
(1978a,  1984)  gave  the  class  of  admissible  rules  in  an  exponential  family.  Using  the 
same  argument  in  Ledwina,  Cohen  and  Sackrowitz  (1991)  proved  a theorem  that 
gives  the  class  of  exact,  unbiased,  and  admissible  tests  in  two-way  contingency  tables. 
They constructed the exact test of size $\alpha$ by ordering the tables according to their
probabilities  on  sample  points  where  the  test  would  randomize.  They  made  the 
number  of  tables  on  which  randomization  would  occur  considerably  smaller  than  in 
the  usual  test. 


1.2  Summary of Dissertation Work


In Chapter 2, we present exact tests of conditional independence against the alternative
of no three-factor interaction. Our modified exact tests are adaptations of
the  ordinary  exact  conditional  tests  that  are  less  conservative.  We  propose  a modified 
P-value  based  on  a secondary  partitioning  of  the  sample  space  beyond  that  generated 
by  the  test  statistic.  It  utilizes  both  the  usual  test  statistic  and,  at  the  observed  value 
of  that  statistic,  a supplementary  statistic  T'  directed  toward  a broader  alternative. 
In  the  calculation  of  the  P-value,  we  include  only  those  tables  that  are  at  least  as 
contradictory to the null in terms of T'. One can calculate this modified P-value
for any test statistic having a discrete distribution. The modified P-value is less discrete
than the ordinary P-value, does not employ randomization, and leads to a less
conservative  “exact”  test. 

By  inverting  results  of  tests  using  modified  P-values,  we  obtain  an  exact  and  less 
conservative  confidence  interval,  in  the  sense  that  the  modified  confidence  interval  has 
confidence  coefficient  at  least  the  nominal  level  and  is  narrower  than  the  ordinary  one. 
For 2 x 2 x K tables, we suggest a modified “exact” confidence interval, inverting the
test  based  on  a modified  one-sided  P-value  to  make  the  actual  confidence  coefficient 
closer  to  the  nominal  value.  Also,  we  present  an  alternative  and  usually  even  better 
way  of  constructing  “exact”  confidence  intervals,  based  on  inverting  a two-sided  test 
with  a modified  P-value. 

Furthermore,  we  utilize  the  mid  P-value  to  construct  intervals  applying  these 
methods,  although  these  are  not  exact.  To  compare  these  types  of  intervals,  we 
calculate  actual  coverage  probability  or  expected  length  of  the  confidence  intervals 
based  on  inverting  one-sided  or  two-sided  tests  using  the  ordinary  or  modified  P-value. 




In  Chapter  3,  we  suggest  exact  inference  regarding  conditional  associations  in 
three-way contingency tables. For exact tests of conditional independence in I x J x K
tables,  three  statistics  assuming  a lack  of  three-factor  interaction  are  discussed,  and 
then  we  provide  three  other  test  statistics  permitting  three-factor  interaction.  All  six 
test  statistics  are  score  statistics  for  loglinear  models  that  treat  none,  one,  or  both 
of  the  classifications  as  ordinal.  Also  they  have  asymptotic  chi-squared  distributions. 
Using  these  statistics,  we  propose  modified  exact  P-values  for  six  tests  for  testing 
conditional  independence  with  I x J x K tables. 

For  cases  that  are  currently  computationally  infeasible,  we  construct  a simulation 
algorithm  to  obtain  precise  estimates  of  ordinary  and  modified  exact  P-values,  using  a 
table-generation  procedure  suggested  by  Patefield  (1981).  We  utilize  six  test  statistics 
for  exact  tests  of  conditional  independence. 

In Chapter 4, we generalize results of Cohen and Sackrowitz (1991, 1992) to construct
exact, unbiased, and admissible tests for an ordinal alternative to conditional
independence  for  I x J x K tables.  We  first  show  unbiasedness  of  tests  when  one 
wishes  to  test  a null  hypothesis  of  conditional  independence  against  the  alternative  of 
no  three-factor  interaction  model  in  three-way  contingency  tables.  Then  we  present 
the  complete  class  of  tests  and  admissible  tests  in  an  exponential  family  following 
Eaton  (1970)  and  Ledwina  (1978a,  1984).  Using  these  arguments,  we  generalize  to 
the  three-way  case  some  results  of  Cohen  and  Sackrowitz  regarding  admissibility  of 
tests  for  two-way  tables.  Combining  these,  we  have  a theorem  that  gives  the  class  of 
exact,  unbiased,  and  admissible  tests  in  three-way  contingency  tables. 

With  this  theorem,  we  discuss  how  to  construct  unbiased  tests  and  how  to  set 
up critical regions to obtain tests of conditional independence of fixed size $\alpha$, for
an ordinal alternative. We construct the exact test of size $\alpha$ by ordering the tables
according  to  a secondary  statistic  directed  toward  a broader  alternative  hypothesis  at 
the randomization points, utilizing the modified approach discussed in Chapter 2. By
using the modified approach, the resulting test is admissible after randomization, and
it  requires  less  randomization  than  usual.  Also,  we  have  actual  size  closer  to  a nominal 
value.  The  Appendix  contains  a FORTRAN  program.  Using  this  program,  one  can 
easily  get  ordinary  and  modified  exact  inference  about  conditional  associations  for 
2 x 2 x K contingency tables.


CHAPTER  2 

IMPROVED  EXACT  INFERENCE  ABOUT  CONDITIONAL  ASSOCIATION 


2.1  Introduction 


When a test statistic has a discrete distribution, ordinary “exact” tests and confidence
intervals can be highly conservative due to discreteness. If we conduct a test
using some preassigned size $\alpha$, the probability of Type I error is always less than or
equal to the preassigned value. If one constructs an “exact” confidence interval with
confidence coefficient $1 - \alpha$, the actual confidence coefficient is at least that level and
is  unknown  (Neyman  1935).  We  wish  to  improve  ordinary  exact  inferential  methods 
by  decreasing  the  conservativeness  that  occurs  due  to  discreteness.  In  this  chapter, 
we  suggest  modifications  of  exact  inferential  methods  for  conditional  associations  in 
2 x 2 x K contingency tables.

For  instance,  we  present  an  example  of  a 2 x 2 x 5 table  for  which  the  ordinary  95% 
confidence  interval  for  an  assumed  common  odds  ratio  is  (1.1,  531.5).  The  discreteness 
implies  that  .95  is  a lower  bound  for  the  actual  confidence  coefficient.  We  show  how 
to  construct  a modified  confidence  interval  that  also  has  the  guarantee  of  at  least  95% 
confidence,  but  takes  the  much  shorter  range  (2.1,  67.3).  Our  approach  is  applicable 
for  any  contingency  table  of  size  larger  than  2x2,  but  we  illustrate  the  arguments  in 
terms  of  inferences  about  conditional  associations  in  2 x 2 x K contingency  tables. 
The  ideas  and  notations  apply  throughout  the  dissertation.  In  this  chapter  we  are 
focusing  on  2 x 2 x K contingency  tables. 

For three-way tables, consider the hypothesis of conditional independence of two
variables, given the third one. For instance, if $\{\pi_{ijk}\}$ denote probabilities for a
multinomial distribution over the I x J x K cells, where $\sum_i \sum_j \sum_k \pi_{ijk} = 1$, the
hypothesis states that

$$\pi_{ijk} = \pi_{i+k} \, \pi_{+jk} / \pi_{++k}.$$

The subscript “+” denotes the sum over the index it replaces. Let $N = \{n_{ijk}\}$ denote
the cell counts, with expected frequencies $\{m_{ijk}\}$. We discuss exact conditional tests
of this hypothesis, generalizing Fisher’s exact test for 2 x 2 tables. We also discuss
confidence intervals for odds ratios pertaining to conditional association.

Let  X denote  the  row  classification,  Y the  column  classification,  and  Z the  layer 
classification.  The  hypothesis  of  conditional  independence  of  X and  Y,  given  Z,  is 
usually  tested  against  the  alternative  of  no  three-factor  interaction.  This  alternative 
is  the  loglinear  model  of  form 

$$\log m_{ijk} = \mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \lambda_{ij}^{XY} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}, \tag{2.1}$$

having sufficient statistics $(\{n_{ij+}\}, \{n_{i+k}\}, \{n_{+jk}\})$. The null hypothesis corresponds
to the special case of this model in which all $\lambda_{ij}^{XY} = 0$. Exact conditional tests utilize
the distribution of the sufficient statistics $\{n_{ij+}\}$ for these parameters, conditional on
the other sufficient statistics, which relate to the remaining parameters. For the case
of a 2 x 2 x K table, for instance, one uses the distribution of $\sum_k n_{11k}$, conditional
on the row totals $\{n_{i+k}\}$ and column totals $\{n_{+jk}\}$ for the partial tables (Birch 1964).
The parameter of interest for estimation is the assumed common odds ratio for each
2 x 2 table.
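
To make the conditioning concrete, the sketch below computes the exact null distribution of $\sum_k n_{11k}$ for a 2 x 2 x K table. It relies on the standard fact that, under conditional independence and given the row and column totals of each partial table, the $n_{11k}$ are independent hypergeometric variables, so the distribution of their sum is a convolution. The function names and the example data are hypothetical and serve only as an illustration.

```python
from math import comb


def hypergeom_dist(n1, n2, m):
    """{x: P(n11 = x)} for one partial table with row totals n1, n2 and
    first-column total m, under independence."""
    lo, hi = max(0, m - n2), min(n1, m)
    total = comb(n1 + n2, m)
    return {x: comb(n1, x) * comb(n2, m - x) / total for x in range(lo, hi + 1)}


def null_dist_of_sum(strata):
    """Convolve the per-stratum distributions.

    strata: list of (n1, n2, m) tuples, one per partial table.
    Returns {t: P(sum_k n11k = t)} under conditional independence."""
    dist = {0: 1.0}
    for n1, n2, m in strata:
        new = {}
        for t, p in dist.items():
            for x, q in hypergeom_dist(n1, n2, m).items():
                new[t + x] = new.get(t + x, 0.0) + p * q
        dist = new
    return dist


def exact_p_value(tables):
    """Ordinary one-sided exact P-value, P(T >= t_o), with T = sum_k n11k."""
    strata = [(a + b, c + d, a + c) for (a, b), (c, d) in tables]
    t_obs = sum(a for (a, _), _ in tables)
    dist = null_dist_of_sum(strata)
    return sum(p for t, p in dist.items() if t >= t_obs)


# Two hypothetical partial tables, each [[3, 1], [1, 3]].
print(exact_p_value([[[3, 1], [1, 3]], [[3, 1], [1, 3]]]))  # 361/4900
```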

We  present  exact  tests  of  conditional  independence  for  the  alternative  of  no  three- 
factor  interaction.  Our  modified  exact  tests  are  adaptations  of  the  ordinary  exact 
conditional  tests  that  are  less  conservative.  They  use  a modified  P-value  based  on  a 
secondary partitioning of the sample space beyond that generated by the test statistic.
It utilizes both the usual test statistic and, at the observed value of that statistic,
a supplementary statistic directed toward a broader alternative. A modified P-value
is less discrete than the ordinary P-value and leads to less conservative “exact” tests.
By inverting results of tests using modified P-values, we have an exact and less conservative
confidence interval, in the sense that a modified confidence interval has
confidence  coefficient  at  least  the  nominal  level,  and  it  is  narrower  than  the  ordinary 
one. 

Section  2 introduces  the  modified  P-value  and  shows  that  its  distribution  can  be 
much  less  discrete  than  that  of  the  ordinary  P-value.  We  compare  the  ordinary  and 
modified  P-values  with  examples.  Furthermore,  the  null  expected  value  of  the  P-value 
is  discussed  in  both  procedures  in  order  to  examine  the  degree  of  conservativeness. 
Section  3 discusses  modified  “exact”  confidence  intervals,  based  on  inverting  two 
one-sided  tests  using  the  modified  P-value.  Though  they  are  also  conservative,  they 
may  be  much  narrower  than  the  usual  one.  Illustrations  are  given  for  estimating  an 
assumed  common  odds  ratio  for  several  2x2  tables.  Section  4 presents  an  alternative 
and  usually  even  better  way  of  constructing  “exact”  confidence  intervals,  based  on 
inverting  a two-sided  test  with  a modified  P-value.  Section  5 discusses  some  related 
results  for  logistic  regression  models,  and  Section  6 gives  some  comments. 

2.2  A Less  Conservative  P-value 


Suppose we would like to conduct an exact conditional test for categorical data
using some preassigned size $\alpha$, such as 0.05. Denote by $\Gamma$ the set of contingency
tables having the same marginal counts as the ones that are fixed by the conditioning
argument for the exact conditional test. This is the set of tables over which the
exact conditional distribution is defined. For the test of conditional independence,
for instance, $\Gamma$ is the set of I x J x K tables of nonnegative integers,
$\Gamma = \{Z : \sum_i z_{ijk} = n_{+jk},\ \sum_j z_{ijk} = n_{i+k}, \text{ for all } i, j, k\}$.

It is usually not possible to construct a critical region for exact conditional tests
with preassigned size $\alpha$ because of the discreteness of the distribution. If an exact
test is desired of arbitrary size $\alpha$, supplementary randomization would be required
to make the decision about whether to reject when a table occurs at the boundary
of the critical region. In practice, it is unacceptable to employ randomization, and
one normally simply reports a P-value. In general, suppose we have a test statistic
T, such as a Wald, likelihood ratio, or score statistic, and suppose $t_o$ is the observed
value of T. If large values of T contradict the null, the usual P-value is

$$P = P_{H_0}(T \ge t_o), \tag{2.2}$$

the probability under the null hypothesis that T is at least $t_o$. Ordinarily, if one wants
to make a decision about $H_0$, one rejects if the P-value $\le \alpha$. The discreteness implies
that the test based on the P-value is conservative in the sense that the actual size is

$$P_{H_0}(P \le \alpha) \le \alpha \quad \text{for } 0 < \alpha < 1. \tag{2.3}$$
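
The gap between the actual size and the nominal level can be checked numerically. The following Python sketch, with hypothetical function names and an illustrative choice of margins, computes the actual size of the rule "reject when the P-value is at most $\alpha$" for the one-sided exact test in a single 2 x 2 table.

```python
from math import comb


def null_pmf(n1, n2, m):
    """Hypergeometric null distribution {t: P(T = t)} of the (1,1) cell,
    given row totals n1, n2 and first-column total m."""
    lo, hi = max(0, m - n2), min(n1, m)
    total = comb(n1 + n2, m)
    return {t: comb(n1, t) * comb(n2, m - t) / total for t in range(lo, hi + 1)}


def actual_size(pmf, alpha):
    """P(P-value <= alpha) under the null, where the P-value of an
    outcome t is the upper-tail probability P(T >= t)."""
    support = sorted(pmf)
    size = 0.0
    for t in support:
        p_value = sum(pmf[s] for s in support if s >= t)
        if p_value <= alpha:
            size += pmf[t]
    return size


pmf = null_pmf(4, 4, 4)
print(actual_size(pmf, 0.05))  # 1/70, well below the nominal 0.05
```

With these margins only the most extreme table has a P-value below 0.05, so the actual size is 1/70, roughly 0.014.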

In the exact conditional approach, one conditions on sufficient statistics for unknown
parameters in order to eliminate them. Then, the tail probability that determines
the P-value does not depend on unknown parameters and can be exactly
calculated. The extra conditioning reduces the set of possible test statistic values,
making the distribution more highly discrete. Hence, tests of nominal size $\alpha$ based on
the exact conditional P-value can be even more conservative. The actual probability
of Type I error can be considerably less than the nominal value unless the sample size
is reasonably large. This problem is exacerbated by the tendency of many users to
put too much emphasis on testing at sacred levels such as .05. One can argue that one
should simply report the P-value and not make comparisons to such arbitrary levels,
particularly when data are discrete. However, the discreteness also affects interval
estimation.


2.2.1  The  Modified  “Exact”  P-value 


To  reduce  the  degree  of  conservativeness,  we  suggest  a modified  P-value  based  on 
a less  discrete  distribution  than  that  of  T.  The  modified  P-value  uses  a partition 
of  the  sample  space  that  is  more  refined  than  we  get  using  T alone.  We  use  T to 
construct  a primary  partitioning  of  all  tables  that  have  the  sufficient  statistics  fixed 
by  the  conditional  test.  Then,  within  fixed  values  of  T,  we  generate  a secondary 
partitioning  using  some  other  index  T'  of  the  degree  to  which  the  data  contradict 
the  null  hypothesis.  The  statistic  T'  is  a test  statistic  directed  toward  a somewhat 
broader  alternative  hypothesis,  hence  detecting  information  that  may  be  missed  by 
$T$. Let $t_0$ and $t'_0$ denote the observed values of the primary and secondary statistics.
The modified P-value is defined as

$$P^* = P_{H_0}(T > t_0) + P_{H_0}(T = t_0,\ T' \ge t'_0), \qquad (2.4)$$

where  the  probabilities  are  computed  under  the  null  conditional  distribution.  Instead 
of  including  all  tables  having  T = to  in  the  calculation  of  the  P-value,  we  include  only 
those  that  are  at  least  as  contradictory  to  the  null  in  terms  of  having  at  least  as  large 
a value  of  T'. 
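
As a minimal sketch of definition (2.4), assuming the null conditional distribution is available as a list of (T, T', probability) triples, the ordinary and modified P-values could be computed as follows. The function names and the five-table toy distribution are hypothetical; they only show how the tables tied at $T = t_0$ are split by $T'$.

```python
from fractions import Fraction


def ordinary_p_value(dist, t0):
    """Ordinary exact P-value: null probability that T >= t0."""
    return sum(p for t, _, p in dist if t >= t0)


def modified_p_value(dist, t0, t0_prime):
    """Modified P-value (2.4): P(T > t0) + P(T = t0, T' >= t0')."""
    more_extreme = sum(p for t, _, p in dist if t > t0)
    tied_at_least = sum(p for t, tp, p in dist if t == t0 and tp >= t0_prime)
    return more_extreme + tied_at_least


# Hypothetical null distribution over five tables: (T, T', probability).
toy = [
    (2, 1, Fraction(1, 10)),
    (1, 3, Fraction(2, 10)),
    (1, 2, Fraction(3, 10)),
    (1, 1, Fraction(1, 10)),
    (0, 1, Fraction(3, 10)),
]

# Observed table: T = 1 and T' = 2.
p_ord = ordinary_p_value(toy, t0=1)
p_mod = modified_p_value(toy, t0=1, t0_prime=2)
print(p_ord, p_mod)
```

In this toy case the ordinary P-value is 7/10, while the modified P-value is 3/5, because the tied table with the smaller value of $T'$ no longer contributes.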

To illustrate, consider testing conditional independence in $2 \times 2 \times K$ tables. Normally,
if we expect about the same strength of association in each $2 \times 2$ stratum, we
test against the alternative (2.1) of no three-factor interaction. Using this narrow
alternative helps to build power compared to statistics based on the general alternative,
even if we do not feel that reality exactly satisfies (2.1). Suppose we use as the primary
statistic the score statistic, which is based on $T = \sum_k n_{11k}$ for the conditional
set of tables having the same row and column totals as the observed table. Then one
could use the score statistic for the general alternative (the saturated model) for the
secondary partitioning. This is simply $T' = \sum_k X_k^2$, where $X_k^2$ denotes the Pearson
statistic for testing independence in the $k$th partial table. The secondary statistic
also contains information about the validity of the null hypothesis, but is directed
toward a wider alternative.

Another possibility for the secondary partitioning is to use the null table probability,
in which case $T'$ can be expressed as the negative log of that probability. For
a given value of $T$, tables that are less likely under the null are then considered to
give greater evidence against the null. Let $B = \{Z : Z \in \Gamma,\ T = t_0,\ P(Z) \le P(N)\}$,
where $N$ denotes the observed table, $\Gamma$ denotes the conditional reference set, and
the probabilities are computed under the null. The modified P-value is then

$$P^*_p = P_{H_0}(T > t_0) + P_{H_0}(B). \qquad (2.5)$$

The modified P-value (2.5) orders sample tables in $\Gamma$ according to their probabilities when
$T = t_0$. Hence, this is based on the probability of the observed table as well as some
test statistic. Cohen and Sackrowitz (1992) used this type of P-value for ordinal
tests in two-way tables. We will compare both ways of forming modified P-values
and confidence intervals based on these modified P-values, with examples. We prefer
the $P^*$ of (2.4), with $T' = \sum_k X_k^2$, over the table-probability version (2.5) for the
modified P-value, because both $T$ and $T'$ are then score statistics for
testing conditional independence.

The setting and the statistic $T$ in definitions (2.4) and (2.5) are arbitrary. One
can calculate $P^*$ for any test statistic having a discrete distribution, since it satisfies
$P_{H_0}(P^* \le \alpha) \le \alpha$ for $0 < \alpha < 1$. We show that under the null this modified P-value
has the property

$$P_{H_0}(P^* \le \alpha) \le \alpha \quad \text{for } 0 < \alpha < 1. \qquad (2.6)$$

Let $P^*$ be a modified P-value and let $m$ be a possible marginal configuration. We
first show that the conditional P-value has $P_{H_0}(P^* \le \alpha \mid m) \le \alpha$. The result is
easily obtained by noting that the modified P-value is a special case of the usual
P-value using a more refined partitioning of $T$ and $T'$. The ordinary P-value uses
a partitioning based on $T$, and it is the sum of $P_{H_0}(T = t_0)$ and the probability
of more extreme values of $T$. The modified P-value uses a partitioning based
on $T$ and $T'$ within $T$. Let Max($\cdot$) denote the maximum value, let Min($\cdot$) denote
the minimum value, and let Gap($T$) denote the minimum difference between two
consecutive values of $T$. We assume that $T$ and $T'$ have positive values. Define a
new statistic $T^* = T \times \mathrm{Max}(T')/\mathrm{Gap}(T) + T'$. If Min($T'$) equals 0, we transform
from $T'$ to $T' + 1$ in order to avoid ties in $T^*$. Then, $T^*(Z_1) > T^*(Z_2)$ for all tables
$Z_1, Z_2$ with $T(Z_1) > T(Z_2)$. Let $t^*_0$ denote the value of $T^*$ for the observed table.
Note that a partitioning of the sample space using $T$ and $T'$ within $T$ is equivalent to
a partitioning of the sample space using $T^*$. Since there are no ties, ordering tables
using $T$ and $T'$ within $T$ is equivalent to ordering tables using $T^*$. Then, the sum of
the probability that $T'$ is at least $t'_0$ at $T = t_0$ and the probability of more extreme
values of $T$ is equivalent to the sum of $P_{H_0}(T^* = t^*_0)$ and the probability of more
extreme values of $T^*$. That is,

$$\begin{aligned}
P^* &= P_{H_0}(T > t_0) + P_{H_0}(T = t_0,\ T' \ge t'_0) \\
    &= P_{H_0}(T^* > t^*_0) + P_{H_0}(T^* = t^*_0).
\end{aligned}$$

Hence, the modified P-value is a special case of the usual P-value with a more refined
partitioning, and we have $P_{H_0}(P^* \le \alpha \mid m) \le \alpha$. Then, under the null,

$$P_{H_0}(P^* \le \alpha) = E[P_{H_0}(P^* \le \alpha \mid m)] \le \alpha, \qquad (2.7)$$

since the average of these conditional probabilities over all possible marginal
configurations is less than or equal to $\alpha$. Thus, we have shown that the probability
of Type I error is no greater than the nominal value.

The modified P-value cannot be larger than the ordinary P-value, so the test
based on it is less conservative in the sense that the actual size is closer to the nominal
value. Also, the sampling distribution of the modified P-value is less discrete than
usual in the sense that its support can have considerably more points. When each
table with a particular statistic value $T$ has the same value of $T'$, then $P^*$ is the same
as the usual exact P-value. As a special case, when there is only one table having
each distinct value of $T$, such as in Fisher’s exact test, they are identical. Note that
if $T$ is a score or Wald or likelihood-ratio statistic for a particular alternative, it does
not help to take $T'$ to be one of the other statistics for that same alternative. Because
these tests all depend only on the sufficient statistics under the alternative, two tables
that have the same value of $T$ also have the same value of $T'$, when $T$ and $T'$ are
taken from these procedures. Thus, we base $T'$ on a more general alternative, for
which the extra sufficient statistic provides a finer partitioning.

When a test statistic has a continuous distribution, the P-value has a uniform(0,1)
null distribution. Hence, for the continuous case the expected value of the P-value is $\tfrac{1}{2}$.
We prove now that in the discrete case the expected value of $P$ under the null is
greater than $\tfrac{1}{2}$. For an arbitrary random variable $X$ (Mood, Graybill and Boes 1974,
page 65),

$$\begin{aligned}
EX &= \int_0^\infty [1 - F_X(x)]\,dx - \int_{-\infty}^0 F_X(x)\,dx \\
   &= \int_0^\infty [1 - \Pr(X \le x)]\,dx - \int_{-\infty}^0 \Pr(X \le x)\,dx.
\end{aligned}$$

Thus, $EP = \int_0^1 [1 - \Pr(P \le p)]\,dp$. Since, from (2.6), $1 - \Pr(P \le p) \ge 1 - p$ for $0 < p < 1$,
we have

$$EP \ge \int_0^1 (1 - p)\,dp = \frac{1}{2},$$

with strict inequality in the discrete case, because $\Pr(P \le p) < p$ for every $p$ lying
strictly between two consecutive attainable P-values.

In the discrete case, the P-value is stochastically larger than the uniform, and its
expected value exceeds $\tfrac{1}{2}$. Hence, we can describe the degree of conservativeness by
comparing $E_{H_0}P$ to 0.5. If the expected value exceeds 0.5 by much, the conservativeness
is severe.


2.2.2  The  Modified  Mid  P-value 


The mid P-value (Lancaster 1961) is another alternative to the usual P-value that
many statisticians have recommended as a way of compromising between having a
conservative test and using supplementary randomization (e.g., Barnard 1990). It is
defined by

$$P_{\mathrm{mid}} = P_{H_0}(T > t_0) + \tfrac{1}{2} P_{H_0}(T = t_0).$$

It  subtracts  half  of  the  probability  of  the  observed  statistic  from  the  usual  exact 
P-value.  The  mid  P-value  has  the  appealing  property  that  its  null  expected  value 
for  a discrete  distribution  equals  exactly  the  expected  P-value  for  a continuous 
distribution.  A disadvantage  is  that  a test  based  on  it  is  no  longer  “exact,”  the  actual 
size  possibly  exceeding  the  nominal  value. 
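
The two expected-value statements (above 1/2 for the ordinary exact P-value, exactly 1/2 for the mid P-value) can be checked numerically. The sketch below is illustrative only: the function name is hypothetical and the Binomial(3, 1/2) null distribution is an assumed toy case, not one of the examples in this chapter.

```python
from fractions import Fraction


def expected_p_values(pmf):
    """Return (E[P], E[P_mid]) under the null pmf of a discrete statistic T.

    `pmf` maps each value t of T to its null probability; larger t is
    treated as more extreme.
    """
    e_p = Fraction(0)
    e_mid = Fraction(0)
    for t0, p0 in pmf.items():
        upper = sum(p for t, p in pmf.items() if t > t0)
        e_p += p0 * (upper + p0)        # ordinary P-value when T = t0
        e_mid += p0 * (upper + p0 / 2)  # mid P-value when T = t0
    return e_p, e_mid


# Assumed toy null distribution: Binomial(3, 1/2).
binom = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}
e_p, e_mid = expected_p_values(binom)
print(e_p, e_mid)
```

For this toy distribution the ordinary P-value has null expectation 21/32, while the mid P-value has null expectation exactly 1/2.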

The mid P-value assigns weight $\tfrac{1}{2}$ to probabilities of all tables comparable to
the observed table in the sense that $T = t_0$. For the modified P-value (2.4), the
comparable tables are those with $T = t_0$ and $T' = t'_0$. Thus, we can define a mid P
version of the modified P-value by

$$P^*_{\mathrm{mid}} = P^* - \tfrac{1}{2} P_{H_0}(T = t_0,\ T' = t'_0). \qquad (2.8)$$

Like the ordinary mid P-value, the modified mid P-value has null expected value
equal to $\tfrac{1}{2}$. The result is easily obtained by noting that the modified mid P-value is
a special case of the usual mid P-value using a more refined partitioning of $T$ and
$T'$. The ordinary mid P-value uses a partitioning based on $T$, and it is the sum of
half of $P_{H_0}(T = t_0)$ and the probability of more extreme values of $T$. The modified
mid P-value uses a partitioning based on $T$ and $T'$ within $T$. We assume that $T$
and $T'$ have positive values. Let Gap($T$) denote the minimum difference between two
consecutive values of $T$. Define a new statistic $T^* = T \times \mathrm{Max}(T')/\mathrm{Gap}(T) + T'$. If
Min($T'$) equals 0, we transform from $T'$ to $T' + 1$ in order to avoid ties in $T^*$. Then,
$T^*(Z_1) > T^*(Z_2)$ for all tables $Z_1, Z_2$ with $T(Z_1) > T(Z_2)$. Let $t^*_0$ denote the value of
$T^*$ for the observed table. Note that a partitioning of the sample space using $T$ and
$T'$ within $T$ is equivalent to a partitioning of the sample space using $T^*$. Since there
are no ties, ordering tables using $T$ and $T'$ within $T$ is equivalent to ordering tables
using $T^*$. Then, the sum of half of $P_{H_0}(T = t_0,\ T' = t'_0)$ and the probability of more
extreme values of $T'$ at $T = t_0$ and more extreme values of $T$ is equivalent to the sum
of half of $P_{H_0}(T^* = t^*_0)$ and the probability of more extreme values of $T^*$. That is,

$$\begin{aligned}
P^*_{\mathrm{mid}} &= P_{H_0}(T > t_0) + P_{H_0}(T = t_0,\ T' > t'_0) + \tfrac{1}{2} P_{H_0}(T = t_0,\ T' = t'_0) \\
                   &= P_{H_0}(T^* > t^*_0) + \tfrac{1}{2} P_{H_0}(T^* = t^*_0).
\end{aligned}$$

Hence, the modified mid P-value is a special case of the mid P-value with a more
refined partitioning, and its null expected value is equal to $\tfrac{1}{2}$. Also, the difference
between the modified P-value and modified mid P-value is no greater than the difference
between the ordinary P-value and ordinary mid P-value. That is,
$(P^* - P^*_{\mathrm{mid}}) \le (P - P_{\mathrm{mid}})$.


2.2.3  Examples 


We consider the test of conditional independence in three-way contingency tables
under the assumption of no three-factor interaction. We will illustrate the ordinary
and modified P-values using $2 \times 2 \times 5$ and $2 \times 2 \times 18$ contingency tables.
For $2 \times 2 \times K$ tables, the exact test utilizes the test statistic $T = \sum_k n_{11k}$ given
$\{n_{1+k}, n_{2+k}, n_{+1k}, n_{+2k}\}$. It assumes homogeneity of the odds ratios in the $2 \times 2 \times K$
contingency tables. For modified P-values, we can utilize $\sum_k X_k^2$ or the table
probability, $P(Z)$, for the secondary statistic $T'$. In the examples we utilize $\sum_k X_k^2$ for $T'$ in
(2.4).

We illustrate the modified P-values (2.4) and (2.5) using Table 2.1, taken from
Mantel (1963). It refers to the effectiveness of immediately injected or 1½-hour-delayed
penicillin in protecting rabbits against lethal injection with β-hemolytic
streptococci. Let P = penicillin level, D = delay, and C = whether cured. Under the
assumption of a constant odds ratio $\theta$ between D and C at each level of P, we test $H_0 : \theta = 1$
against $H_a : \theta > 1$. Our alternative is the higher cure rate for immediate injection.
For the first and last table, the zero marginal count implies that the conditional
distribution of $n_{11k}$ is degenerate, and the table makes no contribution to the test.
Therefore, we can conduct the test using the three remaining tables.

The test statistic is $T = \sum_k n_{11k}$ given marginal totals of row and column variables
at each level of the third one. For these tables, $t_0 = 14$, and the four tables with
$T \ge 14$ are $\{(n_{111}, n_{112}, n_{113}) = (3,6,6), (2,6,6), (3,5,6), (3,6,5)\}$. The values of $T'$
for these four tables are 11.09, 7.54, 6.59, and 11.09, respectively. Among them,
the observed table is (3,6,5). The ordinary exact P-value is $P = P_{H_0}(T \ge 14) =
(2+9+16+2)/1452 = 0.0200$. The modified exact P-values (2.4) and (2.5) are both equal to $(2+2)/1452
= 0.0028$, the null probability for the tables $\{(3,6,6), (3,6,5)\}$.
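
The arithmetic for Table 2.1 can be reproduced by brute-force enumeration of the three informative strata. This is only an illustrative sketch (it is not the FORTRAN program discussed in Section 2.2.4); the helper names are hypothetical, and each stratum is encoded by its fixed margins as read from Table 2.1.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Informative strata of Table 2.1 (penicillin levels 1/4, 1/2, 1), each as
# (row-1 total, row-2 total, column-1 total); row 1 = no delay, column 1 = cured.
strata = [(6, 6, 3), (6, 6, 8), (6, 6, 11)]
observed = (3, 6, 5)  # observed n11 in each stratum


def support(r1, r2, c1):
    """Possible values of n11 given the margins."""
    return range(max(0, c1 - r2), min(r1, c1) + 1)


def weight(z, r1, r2, c1):
    """Unnormalized null (hypergeometric) weight of n11 = z."""
    return comb(r1, z) * comb(r2, c1 - z)


def pearson(z, r1, r2, c1):
    """Pearson chi-squared statistic of the 2x2 table with n11 = z."""
    n = r1 + r2
    a, b, c = z, r1 - z, c1 - z
    d = r2 - c
    return Fraction(n * (a * d - b * c) ** 2, r1 * r2 * c1 * (n - c1))


# Enumerate the reference set: (cells, T, T', unnormalized probability).
tables = []
for zs in product(*(support(*s) for s in strata)):
    w = 1
    for z, s in zip(zs, strata):
        w *= weight(z, *s)
    t_prime = sum(pearson(z, *s) for z, s in zip(zs, strata))
    tables.append((zs, sum(zs), t_prime, w))

total = sum(w for _, _, _, w in tables)
t0 = sum(observed)
t0_prime = next(tp for zs, _, tp, _ in tables if zs == observed)
w_obs = next(w for zs, _, _, w in tables if zs == observed)


def prob(cond):
    """Null probability of the tables satisfying cond(T, T', weight)."""
    return Fraction(sum(w for _, t, tp, w in tables if cond(t, tp, w)), total)


p_ordinary = prob(lambda t, tp, w: t >= t0)
p_mod_chisq = prob(lambda t, tp, w: t > t0 or (t == t0 and tp >= t0_prime))
p_mod_prob = prob(lambda t, tp, w: t > t0 or (t == t0 and w <= w_obs))
print(float(p_ordinary), float(p_mod_chisq), float(p_mod_prob))
```

The enumeration gives 29/1452 for the ordinary exact P-value and 4/1452 for both modified versions, in agreement with the values quoted above.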

For another example, we consider Table 2.2, the “crying babies” data given by
Cox (1970, p. 5), a $2 \times 2 \times 18$ table. On each of 18 days, babies not crying at a
specific time in a hospital ward served as subjects. On each day one baby chosen
at random formed the experimental group, and the remainder were controls. Babies
were identified as crying or not at the end of a specific period. For these tables, the
observed values are $t_0 = 15$ and $t'_0 = 17.2601$, and the P-values are $P = 0.045$ for the
ordinary exact test, $P^* = 0.024$ for (2.4), and $P^*_p = 0.021$ for (2.5).

There can be a considerable discrepancy between the behavior of the ordinary and
modified “exact” P-values, the modified one having a distribution that can be much
less discrete. For Table 2.1, the total number of possible P-values equals 9 for the
ordinary P-value, 32 for $P^*$ of (2.4), and 35 for $P^*_p$ of (2.5). For Table 2.2, the
corresponding numbers are 19, 115938, and 13110. Figure 2.1 presents the cumulative
distribution functions of the ordinary exact P-value and of $P^*$ for null conditional
distributions based on the fixed margins of Table 2.1. Figure 2.2 presents the analogous
distributions for $P^*_p$. Also, Figures 2.3 and 2.4 display the corresponding cumulative
distribution functions for null conditional distributions based on the fixed margins of
Table 2.2. For Table 2.2, the cdf of either modified P-value is practically indistinguishable
from the uniform.

We can summarize the degree of conservativeness of each P-value using $E_{H_0}$(P-value).
Using the conditional distribution based on the fixed margins of Table 2.1, $E_{H_0}P =
0.611$, $E_{H_0}P^* = 0.545$ for (2.4), and $E_{H_0}P^*_p = 0.542$ for (2.5). For Table 2.2, the
corresponding values are 0.576, 0.500, and 0.501.

We now illustrate the ordinary and modified mid P-values. For the modified mid
P-value, we can use $T' = \sum_k X_k^2$ or the table probability for the secondary statistic.
For Table 2.1, $P_{\mathrm{mid}} = 0.011$ and $P^*_{\mathrm{mid}} = 0.002$ for both modified mid P-values using
$\sum_k X_k^2$ or the table probability. For Table 2.2, $P_{\mathrm{mid}} = 0.028$, and $P^*_{\mathrm{mid}} = 0.024$ with
$T' = \sum_k X_k^2$ and 0.021 with the table probability. Figures 2.5 and 2.6 present the
cumulative distribution functions of the modified exact P-value and the modified mid
P-value using $T' = \sum_k X_k^2$ and the corresponding cumulative distribution functions
using the table probability for $T'$, respectively, for null conditional distributions based
on the margins of Table 2.1. There is a good contrast between the behavior of the
modified “exact” P-value and modified mid P-value. The modified P-value never
exceeds the nominal level, but the modified mid P-value can exceed it. The modified
mid P-value jumps and exceeds the nominal value before the modified P-value jumps
closely to the nominal value.

Figures 2.7 and 2.8 display the cumulative distribution functions of the ordinary
mid P-value and the modified mid P-value using $T' = \sum_k X_k^2$ and the corresponding
cumulative distribution functions using the table probability for the modified mid
P-value, respectively, for the null conditional distribution based on the margins of
Table 2.1. Though tests based on the ordinary and modified mid P-value are not
“exact,” the gap between the actual size and the nominal level tends to be less for
the modified mid P-value than for the ordinary mid P-value. One way to measure
how close the cdf of $P$ is to the uniform cdf is by the measure

$$M = \int |F(x) - G(x)|\,dx,$$

where $F$ = cdf of $P$ and $G$ = uniform cdf. Using Table 2.1 with $T' = \sum_k X_k^2$, we
have $M = 0.055$ for $P_{\mathrm{mid}}$, and $M = 0.022$ for $P^*_{\mathrm{mid}}$. For the exact P-values, we have
$M = 0.111$ for $P$, and $M = 0.045$ for $P^*$.
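
Because the cdf of a discrete P-value is a step function, $M$ can be evaluated exactly, piece by piece. The sketch below is illustrative: the function names are hypothetical, the integral is assumed to run over $[0, 1]$, and the Binomial(3, 1/2) distribution is an assumed toy case. For an exact P-value, $F(x) \le x$ on $[0, 1]$, so $M$ reduces to $E(P) - \tfrac{1}{2}$; this agrees with the values reported above ($0.611 - 0.5 = 0.111$ and $0.545 - 0.5 = 0.045$).

```python
from fractions import Fraction


def abs_integral(c, a, b):
    """Integral of |c - x| over [a, b] for a constant c."""
    if c <= a:
        return ((b - c) ** 2 - (a - c) ** 2) / 2
    if c >= b:
        return ((c - a) ** 2 - (c - b) ** 2) / 2
    return ((c - a) ** 2 + (b - c) ** 2) / 2


def uniform_distance(p_dist):
    """M = integral over [0, 1] of |F(x) - x|, with F the cdf of a discrete P-value.

    `p_dist` maps each attainable P-value to its null probability.
    """
    points = sorted(p_dist)
    knots = [Fraction(0)] + points + [Fraction(1)]
    total, cdf = Fraction(0), Fraction(0)
    for i in range(len(knots) - 1):
        if i > 0:
            cdf += p_dist[points[i - 1]]  # F jumps at each attainable P-value
        total += abs_integral(cdf, knots[i], knots[i + 1])
    return total


def p_value_distribution(pmf, mid=False):
    """Null distribution of the ordinary (or mid) P-value of a statistic T."""
    dist = {}
    for t0, p0 in pmf.items():
        upper = sum(p for t, p in pmf.items() if t > t0)
        value = upper + (p0 / 2 if mid else p0)
        dist[value] = dist.get(value, 0) + p0
    return dist


# Assumed toy null distribution: Binomial(3, 1/2).
binom = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}
m_exact = uniform_distance(p_value_distribution(binom))
m_mid = uniform_distance(p_value_distribution(binom, mid=True))
print(m_exact, m_mid)
```

In this toy case $M$ is 5/32 for the exact P-value and 5/64 for the mid P-value, the same halving pattern seen in the values reported for Table 2.1.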


Table 2.1. Example for exact analyses.

Penicillin                        Response
Level          Delay            Cured    Died
1/8            None               0        6
               1 1/2 Hour         0        5
1/4            None               3        3
               1 1/2 Hour         0        6
1/2            None               6        0
               1 1/2 Hour         2        4
1              None               5        1
               1 1/2 Hour         6        0
4              None               2        0
               1 1/2 Hour         5        0

Source: Mantel (1963)


Table 2.2. Example for exact analyses.

              Treated                 Control
Day     Not Crying   Crying     Not Crying   Crying
 1          1          0            3          5
 2          1          0            2          4
 3          1          0            1          4
 4          0          1            1          5
 5          1          0            4          1
 6          1          0            4          5
 7          1          0            5          3
 8          1          0            4          4
 9          1          0            3          2
10          0          1            8          1
11          1          0            5          1
12          1          0            8          1
13          1          0            5          3
14          1          0            4          1
15          1          0            4          2
16          1          0            7          1
17          0          1            4          2
18          1          0            5          3

Source: Cox (1970)


[Figures 2.1 to 2.8: vertical axis P(P-value <= x).]

Figure 2.1. Two cumulative distribution functions of exact P-values with $T' = \sum_k X_k^2$,
for the margins of Table 2.1.

Figure 2.2. Two cumulative distribution functions of exact P-values with $T' = P(Z)$,
for the margins of Table 2.1.

Figure 2.3. Two cumulative distribution functions of exact P-values with $T' = \sum_k X_k^2$,
for the margins of Table 2.2.

Figure 2.4. Two cumulative distribution functions of exact P-values with $T' = P(Z)$,
for the margins of Table 2.2.

Figure 2.5. Cumulative distribution functions of the modified exact P-value and the
modified mid P-value with $T' = \sum_k X_k^2$, for the margins of Table 2.1.

Figure 2.6. Cumulative distribution functions of the modified exact P-value and the
modified mid P-value with $T' = P(Z)$, for the margins of Table 2.1.

Figure 2.7. Cumulative distribution functions of the ordinary mid P-value and the
modified mid P-value with $T' = \sum_k X_k^2$, for the margins of Table 2.1.

Figure 2.8. Cumulative distribution functions of the ordinary mid P-value and the
modified mid P-value with $T' = P(Z)$, for the margins of Table 2.1.


2.2.4  Software


Thomas (1975) gave the first algorithm for exact analysis of several $2 \times 2$ contingency
tables. This FORTRAN program required enumeration of all possible tables
in the conditional reference set; hence, it could be slow. It provided exact tests for
conditional independence as well as an exact confidence interval for a common odds
ratio, and computed the conditional maximum likelihood estimate. Vollset and Hirji
(1991) presented a fast FORTRAN program for the exact test of conditional independence
and confidence interval for a common odds ratio in several $2 \times 2$ contingency
tables.

We suggest modifications of exact methods based on ordering the tables by their
secondary statistic. In order to implement a modified exact test, we need to compare
the secondary statistic, $T'$, of the generated table to that of the observed table, for
tables such that $T = t_0$, and decide whether the table contributes to the P-values.
We have modified Vollset and Hirji’s FORTRAN program to implement modified
exact P-values. Also, the modified software can compute the expected value and the
cumulative distribution of $P$ in both ordinary and modified procedures. The source
code is listed as Appendix A.


2.3  A Less  Conservative  “Exact”  Confidence  Interval 


Discreteness also affects confidence interval estimation. For the “exact” confidence
interval with nominal confidence coefficient $1 - \alpha$, the actual confidence coefficient is
at least that level and is unknown (Neyman 1935). Since the modified P-value is less
discrete than the ordinary P-value and leads to less conservative “exact” tests, we can
reduce the conservativeness by employing the modified P-value for the construction
of confidence intervals.

For $2 \times 2 \times K$ tables, we suggest modified “exact” confidence intervals for an
assumed common odds ratio based on inverting results of tests using the modified
P-value. Such intervals have confidence coefficient guaranteed to equal at least the
nominal level, but are narrower than the ordinary “exact” interval. Illustrations are
given for estimating an assumed common odds ratio for several $2 \times 2$ tables.


2.3.1  The  Ordinary  “Exact”  Confidence  Interval 


One  can  construct  an  exact  confidence  interval  for  a parameter  by  inverting  the 
exact  conditional  test  regarding  the  value  of  that  parameter.  For  an  ordinary  exact 
confidence  interval,  one  can  invert  the  test  based  on  the  ordinary  exact  P-value. 

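To illustrate, suppose we want to estimate an assumed common odds ratio, $\theta$, in a
$2 \times 2 \times K$ contingency table. The conditional probability of any table in the reference
set, $\Gamma$, is

$$P(\{n_{11k}\} \mid \{n_{1+k}\}, \{n_{+1k}\}, \{n_{+2k}\}; \theta) =
\frac{\prod_k \binom{n_{+1k}}{n_{11k}} \binom{n_{+2k}}{n_{1+k} - n_{11k}} \theta^{n_{11k}}}
     {\sum_{Z \in \Gamma} \prod_k \binom{n_{+1k}}{z_k} \binom{n_{+2k}}{n_{1+k} - z_k} \theta^{z_k}},
\qquad (2.9)$$

where $\{z_1, \ldots, z_K\}$ denote values of $\{n_{111}, \ldots, n_{11K}\}$ for a table $Z$ in the reference set
$\Gamma$. Let $\Gamma_t = \{Z : Z \in \Gamma,\ \sum_k z_k = t\}$. Ordinary exact confidence limits for the
common odds ratio are constructed from the conditional distribution of $T = \sum_k n_{11k}$,
that is

$$P(T = t; \theta) = \frac{c_t \theta^t}{\sum_{u = t_{\min}}^{t_{\max}} c_u \theta^u}, \qquad (2.10)$$

where

$$c_t = \sum_{Z \in \Gamma_t} \prod_k \binom{n_{+1k}}{z_k} \binom{n_{+2k}}{n_{1+k} - z_k},$$

and where $t_{\min} = \sum_k \max(0, n_{1+k} - n_{+2k})$ and $t_{\max} = \sum_k \min(n_{1+k}, n_{+1k})$. The
ordinary interval (Cox 1970, Gart 1970, Mehta et al. 1985, Vollset et al. 1991) is based on
inverting two separate one-sided tests. It equals $(\theta_-, \theta_+)$, where for $t_{\min} < t_0 < t_{\max}$,

$$\begin{aligned}
\text{at } \theta = \theta_- &: \quad P_1(\theta) = \sum_{t \ge t_0} P(t; \theta) = \frac{\alpha}{2}, \\
\text{at } \theta = \theta_+ &: \quad P_2(\theta) = \sum_{t \le t_0} P(t; \theta) = \frac{\alpha}{2}.
\end{aligned} \qquad (2.11)$$

When $t_0 = t_{\min}$, the lower endpoint is 0; if $t_0 = t_{\max}$, the upper endpoint is $\infty$. It is
easily shown that $(\theta_-(t), \theta_+(t))$ has confidence coefficient at least $100(1 - \alpha)\%$ (Mehta
et al. 1985). Due to discreteness of the distribution of $T$, we have only a conservative
confidence interval, and the actual confidence coefficient is unknown.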
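
As an illustrative sketch of (2.10) and (2.11), the coefficients $c_t$ can be obtained by multiplying the stratum-level polynomials, after which the two tail functions are simple ratios. Polynomial multiplication is one standard way to obtain $c_t$ and is used here as an assumption; it is not necessarily how the cited programs proceed. The function names are hypothetical, and each stratum is encoded as $(n_{1+k}, n_{+1k}, n_{+2k})$.

```python
from fractions import Fraction
from math import comb


def stratum_coefficients(r1, c1, c2):
    """Map z -> C(c1, z) * C(c2, r1 - z) for one 2x2 stratum with fixed margins."""
    lo, hi = max(0, r1 - c2), min(r1, c1)
    return {z: comb(c1, z) * comb(c2, r1 - z) for z in range(lo, hi + 1)}


def convolve(strata):
    """Coefficients c_t of (2.10), found by multiplying the stratum polynomials."""
    coef = {0: 1}
    for s in strata:
        nxt = {}
        for t, c in coef.items():
            for z, w in stratum_coefficients(*s).items():
                nxt[t + z] = nxt.get(t + z, 0) + c * w
        coef = nxt
    return coef


def pmf(coef, theta):
    """P(T = t; theta) = c_t theta^t / sum_u c_u theta^u."""
    theta = Fraction(theta)
    norm = sum(c * theta ** t for t, c in coef.items())
    return {t: c * theta ** t / norm for t, c in coef.items()}


def upper_tail(coef, t0, theta):
    """P_1(theta): sum of P(T = t; theta) over t >= t0."""
    return sum(p for t, p in pmf(coef, theta).items() if t >= t0)


def lower_tail(coef, t0, theta):
    """P_2(theta): sum of P(T = t; theta) over t <= t0."""
    return sum(p for t, p in pmf(coef, theta).items() if t <= t0)


# Informative strata of Table 2.1 as (n_{1+k}, n_{+1k}, n_{+2k}).
coef = convolve([(6, 3, 9), (6, 8, 4), (6, 11, 1)])
p1_null = upper_tail(coef, 14, 1)
print(min(coef), max(coef), p1_null)
```

At $\theta = 1$ the upper tail equals 29/1452, the ordinary exact P-value reported for Table 2.1; the ordinary limits in (2.11) are the values of $\theta$ at which the two tail functions equal $\alpha/2$.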


2.3.2  The  Modified  “Exact”  Confidence  Interval 


To ensure that the actual confidence coefficient is closer to the nominal value and
to obtain a narrower “exact” interval, one can invert the two one-sided tests based
on the modified exact P-value. We illustrate this using a secondary statistic $\sum_k X_k^2$
or the table probability to generate the secondary partitioning. In the non-null case,
$T'$ is defined as

$$T' = \sum_k X_k^2(\theta) = \sum_k \sum_i \sum_j \frac{(n_{ijk} - m_{ijk}(\theta))^2}{m_{ijk}(\theta)},$$

where $m_{ijk}(\theta)$ is the estimate of the expected cell count, assuming common odds ratio
$\theta$. When $\theta = 1$, $\sum_k X_k^2(\theta)$ is the Pearson statistic for testing conditional independence.
If large values of $T'$ contradict the null, we let $B(\theta) = \{Z : Z \in \Gamma,\ T = t_0,\ T'(\theta) \ge
t'_0(\theta)\}$. When the table probability is utilized, we denote $P(Z; \theta)$ as the probability
of table $Z$ when the common odds ratio is $\theta$, and let $B(\theta) = \{Z : Z \in \Gamma,\ T =
t_0,\ P(Z; \theta) \le P(N; \theta)\}$. The modified “exact” confidence limits are found using the
functions

$$\begin{aligned}
P_1^*(\theta) &= \sum_{t > t_0} P(t; \theta) + P[B(\theta); \theta], \\
P_2^*(\theta) &= \sum_{t < t_0} P(t; \theta) + P[B(\theta); \theta].
\end{aligned} \qquad (2.12)$$

The lower limit, $\theta_-^*$, is the smallest of all $\theta$’s to satisfy $P_1^*(\theta) \ge \alpha/2$, and the upper
limit, $\theta_+^*$, is the largest of all $\theta$’s to satisfy $P_2^*(\theta) \ge \alpha/2$. When $P_1^*(\theta)$ and $P_2^*(\theta)$ are
strictly monotone functions of $\theta$, the limits satisfy $P_1^*(\theta_-^*) = P_2^*(\theta_+^*) = \alpha/2$.

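We show that the probability that this interval excludes $\theta$, $\Pr(\theta_-^* > \theta) + \Pr(\theta_+^* <
\theta)$, is at most $\alpha$. The lower limit is the smallest value of $\theta$ for which $P_1^*(\theta) \ge \alpha/2$. For
$\theta < \theta_-^*$, $P_1^*(\theta) < \alpha/2$. It follows that

$$\begin{aligned}
\Pr(\theta_-^* > \theta) &\le \Pr\left(P_1^*(\theta) < \tfrac{\alpha}{2}\right) \\
                        &= E\left[\Pr\left(P_1^*(\theta) < \tfrac{\alpha}{2} \,\middle|\, m\right)\right] \\
                        &\le \frac{\alpha}{2},
\end{aligned}$$

where $m$ denotes a possible marginal configuration, and the last step follows because
of discreteness. For the upper limit, by the same arguments we have $\Pr(\theta_+^* < \theta) \le \alpha/2$.
The result follows.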


Clearly, this interval is contained within the ordinary one. Hence, the modified
confidence interval is “exact,” yet it has actual confidence coefficient closer to the
nominal value than the ordinary “exact” interval. One can solve for the modified
endpoints numerically, based on the ordinary endpoints as the initial values. The
algorithm to find the endpoints is as follows. Start with an initial value based on the
ordinary one, since the modified limits are contained within the ordinary ones. Note
that $P_1(\theta)$ and $P_2(\theta)$ are strictly monotone functions of $\theta$ (Mehta et al. 1985). Also
note that $P_1^*(\theta)$ is bounded by $P_1(\theta)$, and $P_2^*(\theta)$ is bounded by $P_2(\theta)$. Even though
$P_1^*(\theta)$ and $P_2^*(\theta)$ are not necessarily monotone functions of $\theta$, the limits can be found within the
ordinary limits because they are bounded by $P_1(\theta)$ and $P_2(\theta)$, respectively. Hence
ordinary confidence limits provide good starting values for both the monotone case
and the non-monotone case. The initial value for the lower limit can be set to be $\theta_-$,
and the initial value for the upper limit can be set to be $1.01 \times \theta_+$.

Suppose we want to find the lower limit. Generally, the searching algorithm is
composed of two steps. The first step is to increase the value of $\theta$ until some value of
$\theta$ has $P_1^*(\theta) \ge \alpha/2$. For the sake of the non-monotone case, the value of $\theta$ is increased
by a small amount so that $P_1^*(\theta)$ cannot change much between two successive values of $\theta$. The
second step is iteration within an interval to find the limit. Denote by $\theta_A$ the most
recent estimate that has $P_1^*(\theta) < \alpha/2$ and denote by $\theta_B$ the most recent estimate that
has $P_1^*(\theta) \ge \alpha/2$. The initial values of $\theta_A$ and $\theta_B$ are set to be zero. As $\theta$ changes, $\theta_A$
or $\theta_B$ is updated depending on the value of $P_1^*(\theta)$, and these values will be used for
the second stage to determine an interval for iteration.

More specifically, if $P_1^*(\theta) < \alpha/2$, the current estimate is too small. If $P_1^*(\theta) > \alpha/2$,
the current estimate is too large. For the first step, compute $P_1^*(\theta)$ at the initial value
of $\theta$. If $P_1^*(\theta) = \alpha/2$, this is the limit. If $P_1^*(\theta) < \alpha/2$, multiply $\theta$ by 1.01 to increase the
value of $\theta$. Using this new estimate, compute $P_1^*(\theta)$. Continue this process until some
estimate is found that has $P_1^*(\theta) > \alpha/2$. Once this happens, the second step begins.
Iteration occurs between two values of $\theta$. These two values are the previous estimate
that has $P_1^*(\theta) < \alpha/2$ and the current estimate that has $P_1^*(\theta) > \alpha/2$. Note that $\theta_A$ and
$\theta_B$ have been updated as the estimate changes. Then a new estimate is formed from
$\theta_A$ and $\theta_B$, and $P_1^*(\theta)$ is computed using this estimate. Depending on the value of $P_1^*(\theta)$,
$\theta_A$ or $\theta_B$ is updated. The process continues until the two bracketing values agree
closely, in the sense that their ratio minus 1 is sufficiently close to zero.
If $P_1^*(\theta)$ and $P_2^*(\theta)$ are strictly monotone functions, this algorithm
finds the limits that satisfy $P_1^*(\theta_-^*) = P_2^*(\theta_+^*) = \alpha/2$. If they are not monotone functions, it
finds the smallest of all $\theta$’s to satisfy $P_1^*(\theta) \ge \alpha/2$ and the largest of all $\theta$’s to satisfy
$P_2^*(\theta) \ge \alpha/2$. Thus, this algorithm can be used for both monotone and non-monotone
cases.

For the upper limit, the same procedure follows except that at $\theta = \theta_+$ if $P_2^*(\theta) < \alpha/2$,
multiply $\theta$ by 0.99 to decrease the value of $\theta$. This comes from the fact that if
$P_2^*(\theta) < \alpha/2$, the current estimate is too large, and if $P_2^*(\theta) > \alpha/2$, the current estimate
is too small. This algorithm is an adaptation of one written by Baptista and Pike
(1977) for exact two-sided confidence limits for an odds ratio in a $2 \times 2$ table.
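
The two-step search for the lower limit might be sketched as follows. Several pieces are assumptions made only for illustration: the tail function is the ordinary $P_1(\theta)$ for a single $2 \times 2$ table, standing in for $P_1^*(\theta)$; the second step uses a midpoint update; and the stopping tolerance and starting value are arbitrary. None of these choices is taken from the program described in the text.

```python
from math import comb


def upper_tail(theta, t0, r1, c1, c2):
    """P_1(theta) = P(T >= t0; theta) for one 2x2 table with fixed margins."""
    lo, hi = max(0, r1 - c2), min(r1, c1)
    terms = {t: comb(c1, t) * comb(c2, r1 - t) * theta ** t
             for t in range(lo, hi + 1)}
    return sum(v for t, v in terms.items() if t >= t0) / sum(terms.values())


def lower_limit(tail, start, alpha=0.05, step=1.01, tol=1e-10):
    """Two-step search for the smallest theta with tail(theta) >= alpha / 2.

    `start` must be positive and should lie at or below the limit.
    """
    target = alpha / 2
    # theta_a: latest value with tail < target; theta_b: latest with tail >= target.
    # (Both start at `start` here purely for convenience of the sketch.)
    theta_a = theta_b = start
    # Step 1: increase theta by small multiplicative steps until the tail
    # function first reaches alpha / 2.
    while tail(theta_b) < target:
        theta_a, theta_b = theta_b, theta_b * step
    # Step 2: iterate inside [theta_a, theta_b]. The midpoint rule is an
    # assumption of this sketch.
    while theta_b / theta_a - 1 > tol:
        mid = (theta_a + theta_b) / 2
        if tail(mid) < target:
            theta_a = mid
        else:
            theta_b = mid
    return theta_b


# Hypothetical single table: row-1 total 6, column totals 8 and 4, observed n11 = 5.
def tail(theta):
    return upper_tail(theta, 5, 6, 8, 4)


theta_l = lower_limit(tail, start=0.01)
print(theta_l, tail(theta_l))
```

The small multiplicative step in the first loop mirrors the 1.01 factor described above; for the upper limit the same idea would apply with a factor below 1 and the lower-tail function.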

Next, we show that when the ordinary P-value and the modified P-value (2.5) based
on table probabilities are identical, then the ordinary and modified exact confidence
intervals (based on inverting the test using (2.5)) also are identical. Suppose we use
the table probability for $T'$. By the definition,

$$\begin{aligned}
P     &= P_{H_0}(T \ge t_0), \\
P^*_p &= P_{H_0}(T > t_0) + P_{H_0}(\{Z : T = t_0,\ P(Z) \le P(N)\}).
\end{aligned}$$

When the ordinary and modified P-values are identical, we have

$$P_{H_0}(T = t_0) = P_{H_0}(\{Z : T = t_0,\ P(Z) \le P(N)\}).$$

Hence, the observed table has the largest null probability among those tables having
$T = t_0$. This means that when $\theta = 1$, the coefficient for the observed table

$$\prod_k \binom{n_{+1k}}{n_{11k}} \binom{n_{+2k}}{n_{1+k} - n_{11k}}$$

is the largest among those coefficients for tables having $T = t_0$. Since for arbitrary $\theta$
we get

$$\prod_k \binom{n_{+1k}}{n_{11k}} \binom{n_{+2k}}{n_{1+k} - n_{11k}} \theta^{n_{11k}}
= \theta^{t_0} \prod_k \binom{n_{+1k}}{n_{11k}} \binom{n_{+2k}}{n_{1+k} - n_{11k}},$$

the table probability for arbitrary $\theta$ depends on only this coefficient. Because the
observed table has the largest coefficient among those tables having $T = t_0$, it has
the largest probability among those tables having $T = t_0$ for arbitrary $\theta$. Hence,
$P(T = t_0; \theta) = P[B(\theta); \theta]$, and the ordinary and modified exact confidence intervals
also are identical.

This property does not hold when $T' = \sum_k X_k^2$ is used to construct the modified P-value. The expected cell counts in $T'$ have explicit forms under the null, but they do not have explicit forms under the alternative assuming $\theta$, though they can be obtained by the iterative proportional fitting algorithm. For those tables having $T = t_0$, if the observed table has the smallest value of $T'$ under the null, it does not necessarily have the smallest value of $T'$ under the alternative. Hence, the ordinary and modified exact confidence intervals are not necessarily identical when $P = P^*$.

We now illustrate exact confidence intervals for a common odds ratio using Tables 2.1 and 2.2. The 95% “exact” interval using the ordinary approach is (1.08, 531.51) for Table 2.1 and (0.86, 21.37) for Table 2.2. The corresponding modified “exact” confidence interval using $T' = \sum_k X_k^2(\theta)$ is (2.08, 67.35) for Table 2.1 and (1.01, 13.63) for Table 2.2. Also, the corresponding modified “exact” confidence interval using the table probability for $T'$ is (2.08, 67.35) for Table 2.1 and (1.04, 14.87) for Table 2.2. We see that inferences can be considerably sharper with the modified approach. For Table 2.1, for instance, the lower bound of the ordinary interval indicates that the true odds ratio could be quite close to conditional independence. The modified interval suggests that the odds ratio is substantively quite different from conditional independence.


2.4  Alternative Modifications of “Exact” Confidence Intervals


In previous sections, we have considered two types of probabilities, that is, the probability of obtaining $T$ equal to or less than the observed value $t_0$, and separately the probability of obtaining $T$ equal to or greater than the observed value $t_0$. Confidence limits are then constructed by inverting the tests. Hence, the confidence intervals discussed so far are based on inverting two separate one-sided tests of level $\alpha/2$ each. We now suggest an alternative way to form an “exact” confidence interval for a common odds ratio. This method is based on inverting a single two-sided test rather than two one-sided tests.

We show that confidence intervals based on inverting two-sided tests tend to be less conservative than those based on inverting two separate one-sided tests. We also discuss modified mid P confidence intervals based on inverting one-sided or two-sided tests using modified mid P-values.

2.4.1  The Ordinary Two-Sided “Exact” Confidence Interval


Sterne (1954) used a two-sided approach in constructing a confidence interval for a single binomial parameter, and Baptista and Pike (1977) used it to construct confidence limits for the odds ratio in a 2 x 2 table. We can extend this directly to 2 x 2 x K tables. For testing a particular value of $\theta$, a two-sided P-value is given by

\[ P(\theta) = \sum_{\{t \,:\, P(t;\theta) \le P(t_0;\theta)\}} P(t;\theta). \qquad (2.13) \]

When the distribution of $T$ has probabilities monotonically increasing in $t$ up to some point and then monotonically decreasing after that, this is simply a two-tail probability. (This has happened for all examples we have considered, and it may indeed be a property of the distribution of $T$ for 2 x 2 x K tables; however, except for $K = 1$, it does not seem to be known whether the distribution of a sum of noncentral hypergeometric variates is unimodal.) The two-sided exact confidence interval then consists of the values of $\theta$ for which this two-sided P-value equals at least $\alpha$. Alternatively, one could base the two-sided P-value on a non-null test statistic (such as the score statistic), and construct the confidence interval by inverting that test using the exact non-null distribution. We will discuss this in Chapter 5.
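
As an illustration of (2.13), the following Python sketch computes the conditional distribution of $T$ from the coefficients $c_t$ and sums the probabilities that do not exceed that of the observed value. It is a minimal sketch under stated assumptions, not the FORTRAN program described in this chapter: the function names, the encoding of each stratum as the triple (n1, n2, m1) standing for $(n_{+1k}, n_{+2k}, n_{1+k})$, and the relative tolerance used to detect ties are all choices made for this example.

```python
from math import comb


def stratum_coeffs(n1, n2, m1):
    """Coefficients C(n1, x) * C(n2, m1 - x) for one 2 x 2 table.

    Assumed encoding: n1, n2 are the two column totals and m1 is the
    first-row total; x plays the role of n_11k.
    Returns (smallest feasible x, list of coefficients).
    """
    lo, hi = max(0, m1 - n2), min(n1, m1)
    return lo, [comb(n1, x) * comb(n2, m1 - x) for x in range(lo, hi + 1)]


def null_coeffs(strata):
    """Convolve the strata to obtain c_t for T = sum_k n_11k.

    Returns (t_min, [c_t for t = t_min, ..., t_max]).
    """
    t_min, c = 0, [1]
    for n1, n2, m1 in strata:
        lo, ck = stratum_coeffs(n1, n2, m1)
        new = [0] * (len(c) + len(ck) - 1)
        for i, a in enumerate(c):
            for j, b in enumerate(ck):
                new[i + j] += a * b
        t_min, c = t_min + lo, new
    return t_min, c


def dist(strata, theta):
    """P(T = t; theta) = c_t theta^t / sum_u c_u theta^u, as a dict.

    The common factor theta^t_min cancels, so powers start at zero.
    """
    t_min, c = null_coeffs(strata)
    w = [ct * theta ** i for i, ct in enumerate(c)]
    total = sum(w)
    return {t_min + i: wi / total for i, wi in enumerate(w)}


def two_sided_p(strata, t0, theta, rel_tol=1e-7):
    """Two-sided P-value (2.13): sum of P(t; theta) over t with
    P(t; theta) <= P(t0; theta); rel_tol guards against rounding in ties."""
    p = dist(strata, theta)
    cut = p[t0] * (1 + rel_tol)
    return sum(v for v in p.values() if v <= cut)
```

For a hypothetical single stratum with all margins equal to 3, `two_sided_p([(3, 3, 3)], 3, 1.0)` sums the two extreme probabilities of the central hypergeometric distribution.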

This two-sided approach produces an interval that is usually, but not necessarily, shorter than the ordinary one based on inverting two separate one-sided tests. Under certain conditions, it can be shown that the two-sided approach is better, at least for one of the endpoints. For instance, when the upper limit $\theta_+$ of this interval is quite large, the distribution of $T$ often satisfies $P(t;\theta_+) > P(t_0;\theta_+)$ for all $t > t_0$. A special case of this holds when the probabilities are monotone increasing in $t$, which is guaranteed when $\theta_+ > \max_t\{c_{t-1}/c_t\}$. In order to show this, from (2.10) we have


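\[ P(T = t; \theta_+) = \frac{c_t \theta_+^{t}}{\sum_{u=t_{\min}}^{t_{\max}} c_u \theta_+^{u}}. \]

For $t_{\min} < t \le t_{\max}$,

\[ P(T = t; \theta_+) - P(T = t-1; \theta_+) = \frac{1}{\sum_{u=t_{\min}}^{t_{\max}} c_u \theta_+^{u}} \left( c_t \theta_+^{t} - c_{t-1} \theta_+^{t-1} \right) = \frac{\theta_+^{t-1}}{\sum_{u=t_{\min}}^{t_{\max}} c_u \theta_+^{u}} \left( c_t \theta_+ - c_{t-1} \right). \]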


If $\theta_+ > c_{t-1}/c_t$, then $P(T = t; \theta_+) > P(T = t-1; \theta_+)$ for arbitrary $t$. Hence, if $\theta_+ > \max_t\{c_{t-1}/c_t\}$, the probabilities are monotone increasing in $t$. In this case, since $P(t; \theta_+) > P(t_0; \theta_+)$ for all $t > t_0$,

\[ \sum_{t \le t_0} P(t; \theta_+) = \alpha. \]

Hence, this upper limit $\theta_+$ is the same as the upper limit obtained using the one-sided testing approach with double the error probability. For instance, the upper limit of the 95% interval based on inverting a two-sided test is then the same as the upper limit of the 90% interval for the approach based on inverting two separate one-sided tests. Analogous remarks apply to the lower limit. In such cases, there is a clear advantage to using this approach based on two-sided tests. Unless one is specifically interested in a one-sided confidence interval (i.e., a lower bound alone or an upper bound alone for $\theta$), we prefer this approach.
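
The equality of the two upper limits can be checked numerically. The Python sketch below is only an illustration: the coefficients `c` are those of a hypothetical pair of 2 x 2 strata with all margins equal to 3 (they are not taken from Table 2.1 or 2.2), the function names are invented for the example, and the one-sided limit is found by plain bisection on $\log\theta$ rather than by the search algorithm described earlier.

```python
from math import exp, log


def probs(c, theta):
    """P(T = t_min + i; theta) for coefficients c[i] (theta^t_min cancels)."""
    w = [ct * theta ** i for i, ct in enumerate(c)]
    total = sum(w)
    return [wi / total for wi in w]


def lower_tail(c, i0, theta):
    """P(T <= t0; theta), with i0 = t0 - t_min; decreasing in theta."""
    return sum(probs(c, theta)[: i0 + 1])


def two_sided_p(c, i0, theta, rel_tol=1e-7):
    """Two-sided P-value (2.13) for the observed index i0."""
    p = probs(c, theta)
    return sum(v for v in p if v <= p[i0] * (1 + rel_tol))


def one_sided_upper_limit(c, i0, level, lo=1e-8, hi=1e8, iters=200):
    """Solve P(T <= t0; theta) = level by bisection on log(theta)."""
    a, b = log(lo), log(hi)
    for _ in range(iters):
        m = 0.5 * (a + b)
        if lower_tail(c, i0, exp(m)) > level:
            a = m   # tail still too large: theta must increase
        else:
            b = m
    return exp(0.5 * (a + b))


# Hypothetical example: two strata, every margin equal to 3.
c = [1, 18, 99, 164, 99, 18, 1]
alpha, i0 = 0.05, 5

# Upper limit of the 90% interval from two one-sided tests (alpha in one tail).
theta_plus = one_sided_upper_limit(c, i0, alpha)

# Condition guaranteeing probabilities increasing in t at theta_plus.
threshold = max(c[i - 1] / c[i] for i in range(1, len(c)))

# If theta_plus exceeds the threshold, the two-sided P-value there is alpha too.
p_two_sided = two_sided_p(c, i0, theta_plus)
```

In this example `theta_plus` exceeds `threshold`, so the two-sided P-value at `theta_plus` coincides with the lower tail probability, which is the situation described in the text.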


2.4.2  The  Modified  Two-Sided  “Exact”  Confidence  Interval 


Following the modified approach of the previous section, one can construct a modification of this confidence interval based on two-sided tests by using a modified P-value. We define a modified two-sided P-value for testing a particular value of $\theta$ as

\[ P^*(\theta) = P(\theta) - P(\{Z : Z \in \Gamma,\ P(t;\theta) = P(t_0;\theta),\ T'(\theta) < t_0'(\theta)\}). \qquad (2.14) \]

Again, if we use the table probability for the secondary partitioning, we define a modified two-sided P-value for testing a particular value of $\theta$ as

\[ P^*(\theta) = P(\theta) - P(\{Z : Z \in \Gamma,\ P(t;\theta) = P(t_0;\theta),\ P(Z;\theta) > P(N;\theta)\}). \qquad (2.15) \]
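
A direct, brute-force reading of (2.15) can be sketched in Python by enumerating every table with the given margins. This is an illustrative sketch only: full enumeration is feasible just for small examples, the names `enumerate_tables` and `modified_two_sided_p` and the stratum encoding (n1, n2, m1) for $(n_{+1k}, n_{+2k}, n_{1+k})$ are assumptions of the example, and a relative tolerance stands in for exact equality of probabilities.

```python
from itertools import product
from math import comb


def enumerate_tables(strata):
    """Yield (xs, coef) for every table Z with the given margins.

    xs is the tuple of n_11k values; coef is the product of binomial
    coefficients, proportional to the null probability of the table.
    """
    supports = []
    for n1, n2, m1 in strata:
        lo, hi = max(0, m1 - n2), min(n1, m1)
        supports.append(
            [(x, comb(n1, x) * comb(n2, m1 - x)) for x in range(lo, hi + 1)]
        )
    for combo in product(*supports):
        coef = 1
        for _, w in combo:
            coef *= w
        yield tuple(x for x, _ in combo), coef


def modified_two_sided_p(strata, observed, theta, rel_tol=1e-7):
    """Return (ordinary, modified) two-sided P-values at theta.

    The modified value subtracts the probability of tables whose T has the
    same probability as t0 but whose table probability exceeds that of the
    observed table, following (2.15).
    """
    weights = {xs: coef * theta ** sum(xs) for xs, coef in enumerate_tables(strata)}
    total = sum(weights.values())
    prob_z = {xs: w / total for xs, w in weights.items()}

    prob_t = {}
    for xs, p in prob_z.items():
        prob_t[sum(xs)] = prob_t.get(sum(xs), 0.0) + p

    p_t0 = prob_t[sum(observed)]
    p_obs = prob_z[tuple(observed)]

    ordinary = sum(p for p in prob_t.values() if p <= p_t0 * (1 + rel_tol))
    removed = sum(
        p
        for xs, p in prob_z.items()
        if abs(prob_t[sum(xs)] - p_t0) <= rel_tol * p_t0
        and p > p_obs * (1 + rel_tol)
    )
    return ordinary, ordinary - removed
```

With two hypothetical strata whose margins all equal 3 and observed counts (3, 0), the ordinary two-sided P-value at $\theta = 1$ is 1, while the modified value removes the two more probable tables that share the same value of $T$.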

For the modified two-sided confidence interval, we consider the shortest interval that contains all of the values of $\theta$ for which

\[ P^*(\theta) \ge \alpha. \qquad (2.16) \]

The lower limit, $\theta_-^*$, is the smallest $\theta$ satisfying (2.16), and the upper limit, $\theta_+^*$, is the largest $\theta$ satisfying (2.16). We show that this confidence interval is “exact.” For all values of $\theta$ lying outside the closed interval $\theta_-^* \le \theta \le \theta_+^*$, it follows that $P^*(\theta) < \alpha$. Then

\[ \Pr(\theta < \theta_-^* \text{ or } \theta > \theta_+^*) \le \Pr(P^*(\theta) < \alpha) \le \Pr(P^*(\theta) \le \alpha) = E[\Pr(P^*(\theta) \le \alpha \mid m)] \le \alpha. \]

Hence, $\Pr(\theta_-^* \le \theta \le \theta_+^*) \ge 1 - \alpha$.

This approach gives intervals that are never wider, and often narrower, than those obtained by inverting the two-sided test with the ordinary P-value. Note that $\theta_-$ is the smallest $\theta$ satisfying $P(\theta) \ge \alpha$. Thus, below $\theta_-$, there is no point having $P(\theta) \ge \alpha$. Also note that $P^*(\theta)$ is bounded by $P(\theta)$, that is, $P^*(\theta) \le P(\theta)$. For instance, at the ordinary lower limit, if $P^*(\theta_-) = P(\theta_-)$, then $\theta_-^* = \theta_-$. Otherwise, $\theta_-^* > \theta_-$. By a symmetric argument, $\theta_+^* \le \theta_+$. Hence, the two-sided modified confidence interval is contained within the two-sided ordinary confidence interval.

We illustrate these alternative “exact” confidence intervals for the common odds ratio using Tables 2.1 and 2.2. For Table 2.1 the 95% confidence interval by inverting a two-sided test is (1.29, 261.49) based on the ordinary exact P-values and (1.38, 40.45) based on either modified exact P-value, that is, using $T' = \sum_k X_k^2(\theta)$ or the table probability. Using Table 2.2 the confidence intervals are (0.88, 15.92) using the ordinary exact P-values, (1.01, 10.30) using the modified P-value with $T' = \sum_k X_k^2(\theta)$, and (1.01, 11.14) using the modified P-value with the table probability.

Table 2.3 contains 95% confidence intervals obtained using the two separate one-sided ordinary and modified exact P-values, and using the ordinary and modified two-sided exact P-values. For these tables, the confidence interval constructed using the ordinary two-sided P-value is shorter than the ordinary one based on two one-sided P-values. In fact, for each data set, the upper endpoint for the two-sided based interval equals the endpoint that would be obtained with the one-sided method for a 90% confidence interval. For each type of interval, the ones based on the modified P-value are narrower yet. For Table 2.2 the modified confidence interval based on $T' = \sum_k X_k^2$ is shorter than the corresponding confidence interval based on the table probability in both one-sided and two-sided cases.

One way to compare the methods of constructing the confidence interval, and to gauge their degree of conservativeness, is to use the coverage function (Vollset and Hirji 1991). The coverage function, for a given value of $\theta$, is computed by summation of $P(t;\theta)$ over the values of $t$ for which the confidence interval contains the given value of $\theta$. The function is then plotted as a function of $\theta$. Hence, it displays how closely the actual coverage probability falls to the nominal coverage probability.
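
The coverage function can be sketched in a few lines of Python. The sketch below is an assumption-laden illustration rather than the program used for the figures: intervals are found by scanning a fixed grid of $\log\theta$ values (so endpoints are only as accurate as the grid, and limits beyond the grid are truncated), only the ordinary two-sided interval is used, and the coefficients `c` belong to a hypothetical pair of strata with all margins equal to 3.

```python
from math import exp


def probs(c, theta):
    """P(T = t_min + i; theta) for coefficients c[i]."""
    w = [ct * theta ** i for i, ct in enumerate(c)]
    total = sum(w)
    return [wi / total for wi in w]


def two_sided_p(c, i0, theta, rel_tol=1e-7):
    """Ordinary two-sided P-value (2.13) for the observed index i0."""
    p = probs(c, theta)
    return sum(v for v in p if v <= p[i0] * (1 + rel_tol))


def grid_interval(c, i0, alpha, grid):
    """Shortest interval containing every grid value with P(theta) >= alpha."""
    keep = [th for th in grid if two_sided_p(c, i0, th) >= alpha]
    return (min(keep), max(keep)) if keep else None


def coverage(c, theta, alpha, grid):
    """C(theta) = sum over t of I(t, theta) P(t; theta), where I(t, theta)
    indicates that the interval computed at T = t contains theta."""
    total = 0.0
    for i0, p_t in enumerate(probs(c, theta)):
        ci = grid_interval(c, i0, alpha, grid)
        if ci is not None and ci[0] <= theta <= ci[1]:
            total += p_t
    return total


# Hypothetical example: two strata, every margin equal to 3.
c = [1, 18, 99, 164, 99, 18, 1]
grid = [exp(-6 + 12 * k / 600) for k in range(601)]  # log(theta) from -6 to 6
cov_at_one = coverage(c, 1.0, 0.05, grid)
```

In this small example the actual coverage at $\theta = 1$ is well above the nominal 0.95, which is the kind of conservativeness that the coverage plots display.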

For the conditional distribution having the fixed marginal counts of Table 2.1, Figures 2.9 and 2.10 show the actual coverage probability as a function of the true log odds ratio, for 95% confidence intervals based on inverting separate one-sided tests using the ordinary or modified P-value. For the secondary partitioning in the modified P-value, we use $\sum_k X_k^2(\theta)$ for Figure 2.9 and the table probability for Figure 2.10. There is a clear advantage to using the interval based on the modified P-value. For Table 2.2, this calculation requires a huge computing time, and we have not been able to get results using the conditional distribution based on the margins of all 18 partial tables. Thus, we display results using various subsets of the partial tables of Table 2.2. Figure 2.11 gives an analogous display using various numbers of partial tables from Table 2.2. It shows how the conservativeness is reduced by using confidence intervals based on inverting tests with modified P-values. As the number of strata increases, the modified approach yields an actual level closer to the nominal level, and this holds over a broader range of odds ratio values.

For either approach, for sufficiently large $\theta$, all tables with those margins would have lower bound of the interval below $\theta$; for sufficiently small $\theta$, all tables would have upper bound above $\theta$. In such cases, the actual probability of coverage of a $100(1 - \alpha)\%$ confidence interval has lower bound $1 - \alpha/2$. That bound is achieved at values of $\theta$ that are potential endpoints of the intervals (Neyman 1935). To show this, let $(\theta_-, \theta_+)$ denote the ordinary interval based on a one-sided test. Suppose that the value of the upper limit, $\theta_+$, is large enough so that all the lower limits from other possible tables are less than $\theta_+$. Since $\theta_+$ is constructed by inverting the one-sided $\alpha/2$ test, we have $P(T \le t_0; \theta_+) = \alpha/2$ and, accordingly, $P(T \ge t_0 + 1; \theta_+) = 1 - \alpha/2$. The coverage function at $\theta = \theta_+$ is

\[ C(\theta_+) = \sum_t I(t, \theta_+) P(t; \theta_+) = \sum_{t \ge t_0 + 1} P(t; \theta_+) = 1 - \alpha/2, \]

where $I(t, \theta_+)$ is an indicator function that indicates whether or not $\theta_+$ is within the confidence interval at $T = t$. Note that at $\theta = \theta_+$, we have $P(T \le t_0; \theta_+) = \alpha/2$, and $\theta_+$ is the upper limit. At some value of $T = t'$, the fact that $\theta_+$ is within this interval corresponds to $P(T \le t'; \theta_+) > \alpha/2$. In order to satisfy this, we need to have $t' \ge t_0 + 1$, since $P(T \le t_0; \theta_+) = \alpha/2$. Hence, the coverage probability, which is the summation of $P(t; \theta_+)$ over $t$ such that $t \ge t_0 + 1$, is $1 - \alpha/2$. For $\theta > \theta_+$ the coverage function has $C(\theta) \ge 1 - \alpha/2$.

Figures 2.12 and 2.13 give an analogous display for the confidence intervals based on inverting two-sided tests using the ordinary or modified P-value using Table 2.1. For the secondary statistic $T'$, Figure 2.12 uses $\sum_k X_k^2(\theta)$ and Figure 2.13 uses the table probability. Again, there is an advantage to the interval based on the modified P-value. Comparing the figures of coverage probability for confidence intervals, we see there is almost always an advantage to using the confidence interval based on inverting two-sided tests. Figure 2.14 gives an analogous display using some fixed sets of margins of Table 2.2. There is a dramatic improvement in the two-sided modified confidence intervals when the number of strata is large. As the number of strata increases, we can expect that the actual coverage probability is very close to the nominal coverage probability. When $\log\theta$ is between -2 and 2, we see there is a large increase in the coverage probability for both the ordinary two-sided and modified two-sided confidence intervals. At that point, many new tables for which the confidence intervals contain the given value of $\theta$ are added to the calculation of the coverage probability, and the jump comes from the newly included non-null table probabilities. For the coverage probability based on two-sided ordinary tests, the big jump occurs before the big jump in the coverage probability based on two-sided modified tests, and the amount of increase is greater than that for two-sided modified tests. Also, at that jump point, more new tables are included for the coverage probability based on two-sided ordinary tests than for the coverage probability based on two-sided modified tests.

We have observed similar results using other sets of fixed margins. In particular, for the two-sided approach, for large $|\log\theta|$, the true coverage probability has 0.95 as a lower bound rather than 0.975. For the proof, let $(\theta_-, \theta_+)$ be the ordinary confidence interval based on the two-sided test. Suppose that the value of the upper limit, $\theta_+$, is large enough so that all of the lower limits from other possible tables are less than $\theta_+$. Then at $\theta = \theta_+$ we have $\sum_{\{t \,:\, P(t;\theta_+) \le P(t_0;\theta_+)\}} P(t;\theta_+) \le \alpha$ and, accordingly, $\sum_{\{t \,:\, P(t;\theta_+) > P(t_0;\theta_+)\}} P(t;\theta_+) \ge 1 - \alpha$. At $\theta_+$ the coverage function is

\[ C(\theta_+) = \sum_t I(t, \theta_+) P(t; \theta_+) = \sum_{\{t \,:\, P(t;\theta_+) > P(t_0;\theta_+)\}} P(t; \theta_+) \ge 1 - \alpha, \]

since at $\theta = \theta_+$, we have $\sum_{\{t \,:\, P(t;\theta_+) \le P(t_0;\theta_+)\}} P(t;\theta_+) \le \alpha$. At some value of $T = t'$, the fact that $\theta_+$ is within this interval corresponds to $\sum_{\{t \,:\, P(t;\theta_+) \le P(t';\theta_+)\}} P(t;\theta_+) > \alpha$. In order to satisfy this, we need to have $P(t'; \theta_+) > P(t_0; \theta_+)$. Then the two-sided ordinary P-value is larger than $\alpha$ at $T = t'$. Hence, the coverage probability, which is the summation over $t$ such that $P(t;\theta_+) > P(t_0;\theta_+)$, is at least $1 - \alpha$. Also for $\theta > \theta_+$ the coverage function has $C(\theta) \ge 1 - \alpha$.

For a special case, suppose that $P(t; \theta_+) > P(t_0; \theta_+)$ for all $t > t_0$. Then at $\theta = \theta_+$,

\[ P(\theta_+) = \sum_{\{t \,:\, P(t;\theta_+) \le P(t_0;\theta_+)\}} P(t; \theta_+) = \sum_{t \le t_0} P(t; \theta_+) = \alpha. \]

Accordingly, we have $\sum_{t \ge t_0 + 1} P(t; \theta_+) = 1 - \alpha$. Then the coverage function at $\theta = \theta_+$ is

\[ C(\theta_+) = \sum_t I(t, \theta_+) P(t; \theta_+) = \sum_{t \ge t_0 + 1} P(t; \theta_+) = 1 - \alpha, \]

since at some value of $T = t'$, the fact that $\theta_+$ is within this interval corresponds to $P(T \le t'; \theta_+) > \alpha$. This requires $t' \ge t_0 + 1$, since $P(T \le t_0; \theta_+) = \alpha$. Hence the coverage function has $C(\theta_+) \ge 1 - \alpha$. This relates to the property mentioned previously, by which an interval endpoint for the two-sided approach with error probability $\alpha$ can equal one for the one-sided approach with error probability $2\alpha$.

So far, we have used the coverage probability to compare the methods of constructing the confidence interval. An alternative way to compare them is to compute the expected length of confidence intervals for $\theta$ or for $\log\theta$. A complication results from infinite endpoints that occur at $T = t_{\min}$ or $T = t_{\max}$. Figure 2.15 displays the expected length of confidence intervals for $\theta$, for four methods, using the margins of Table 2.1. The two-sided modified confidence interval has the smallest expected length, uniformly for all $\theta$. For instance, the expected lengths at $\theta = 1$ are 21.84, 17.22, 13.78, and 11.21 for one-sided ordinary, one-sided modified, two-sided ordinary, and two-sided modified intervals, respectively. For this figure, we arbitrarily set the upper limit equal to 1000 whenever $T = t_{\max}$. Since the expected length depends on the upper limit at $T = t_{\max}$, that value was chosen to be almost two times the maximum finite upper limit among the four methods. Figure 2.16 presents the analogous expected length of confidence intervals for $\log\theta$, using the margins of Table 2.1. Again, the two-sided modified confidence interval has uniformly the smallest expected length. We use 1.0 x 10-^ for the lower limit of $\theta$ at $T = t_{\min}$ and 1000 for the upper limit of $\theta$ at $T = t_{\max}$. Figures 2.17 and 2.18 give analogous displays using the margins of Table 2.1, comparing the lengths conditional on $T \ne t_{\min}$ or $t_{\max}$. Then, the expected length does not depend on the values of the lower limit at $T = t_{\min}$ and the upper limit at $T = t_{\max}$. Again, the two-sided modified confidence interval has uniformly the smallest expected length.


2.4.3  The One-Sided Mid P Confidence Interval


For confidence intervals for a common odds ratio based either on inverting two separate one-sided tests or inverting a two-sided test, one can construct even narrower intervals, albeit not “exact” ones, by inverting the tests based on the modified mid P-value. The ordinary mid P confidence limits based on inverting two separate one-sided tests are found using the functions

\[ P_{mid(1)}(\theta) = P_1(\theta) - \tfrac{1}{2} P(t_0; \theta), \]

\[ P_{mid(2)}(\theta) = P_2(\theta) - \tfrac{1}{2} P(t_0; \theta). \qquad (2.17) \]

The limits are determined by the same method used for the modified exact confidence interval, using $P_{mid(1)}(\theta)$ for the lower limit and $P_{mid(2)}(\theta)$ for the upper limit. Though approximate, this type of confidence interval based on the ordinary mid P-value has been observed empirically to behave well (Mehta and Walsh 1992).
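
The ordinary one-sided mid P limits in (2.17) can be sketched as follows. This Python sketch rests on several assumptions that are not stated in this section: $P_1(\theta)$ is taken to be $P(T \ge t_0; \theta)$ and $P_2(\theta)$ to be $P(T \le t_0; \theta)$, each limit is found by bisection on $\log\theta$ (which relies on the ordinary mid P tail probabilities being monotone in $\theta$) instead of the search algorithm of the earlier section, and the limits at the extreme values of $T$ are set to 0 and infinity. The coefficients in the usage note are hypothetical.

```python
from math import exp


def probs(c, theta):
    """P(T = t_min + i; theta) for coefficients c[i]."""
    w = [ct * theta ** i for i, ct in enumerate(c)]
    total = sum(w)
    return [wi / total for wi in w]


def p_mid_lower(c, i0, theta):
    """Assumed P_mid(1): P(T >= t0; theta) - 0.5 * P(T = t0; theta)."""
    p = probs(c, theta)
    return sum(p[i0:]) - 0.5 * p[i0]


def p_mid_upper(c, i0, theta):
    """Assumed P_mid(2): P(T <= t0; theta) - 0.5 * P(T = t0; theta)."""
    p = probs(c, theta)
    return sum(p[: i0 + 1]) - 0.5 * p[i0]


def solve_increasing(f, target, lo=-20.0, hi=20.0, iters=200):
    """Bisection on log(theta) for a function f that increases in theta."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(exp(mid)) < target:
            lo = mid
        else:
            hi = mid
    return exp(0.5 * (lo + hi))


def mid_p_interval(c, i0, alpha=0.05):
    """Ordinary one-sided mid P limits for the observed index i0 = t0 - t_min."""
    if i0 == 0:
        lower = 0.0  # the lower mid P-value never drops to alpha/2
    else:
        lower = solve_increasing(lambda th: p_mid_lower(c, i0, th), alpha / 2)
    if i0 == len(c) - 1:
        upper = float("inf")
    else:
        # p_mid_upper decreases in theta, so its negative increases.
        upper = solve_increasing(lambda th: -p_mid_upper(c, i0, th), -alpha / 2)
    return lower, upper
```

For example, with the hypothetical coefficients `[1, 18, 99, 164, 99, 18, 1]`, `mid_p_interval(c, 4)` returns limits at which each mid P function equals 0.025.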

Following the modified approach based on using a one-sided modified mid P-value, let $B_1(\theta) = \{Z : Z \in \Gamma,\ T = t_0,\ T'(\theta) = t_0'(\theta)\}$. The modified mid P confidence interval based on inverting two separate one-sided tests uses

\[ P^*_{mid(1)}(\theta) = P_1^*(\theta) - \tfrac{1}{2} P[B_1(\theta) \mid \theta], \]

\[ P^*_{mid(2)}(\theta) = P_2^*(\theta) - \tfrac{1}{2} P[B_1(\theta) \mid \theta]. \qquad (2.18) \]

The limits are chosen by the same method used for the modified exact confidence interval, using $P^*_{mid(1)}(\theta)$ for the lower limit and $P^*_{mid(2)}(\theta)$ for the upper limit. This approach tends to give narrower intervals than obtained by inverting the one-sided test with the ordinary mid P-value. We illustrate these confidence intervals for the common odds ratio using Tables 2.1 and 2.2. For Table 2.1, the 95% confidence interval by inverting a one-sided test is (1.34, 266.54) based on the ordinary mid P-values and (2.22, 56.00) based on the modified mid P-values using $\sum_k X_k^2(\theta)$ or the table probability for $T'$. Using Table 2.2, the confidence intervals are (0.98, 16.89) using the ordinary mid P-values, (1.01, 13.61) using the modified mid P-values with $T' = \sum_k X_k^2(\theta)$, and (1.04, 14.85) using the modified mid P-values with the table probability for $T'$.


2.4.4  The  Two-Sided  Mid  P Confidence  Interval 


As the two-sided approach tends to give an interval that is usually narrower than the one based on inverting two separate one-sided tests, we can construct a shorter interval using two-sided mid P-values. Though these cannot guarantee achieving at least the nominal confidence level, one could define mid P versions of the ordinary two-sided and modified two-sided intervals. For testing a particular value of $\theta$, a two-sided mid P-value can be defined as

\[ P_{mid}(\theta) = P(\theta) - \tfrac{1}{2} P(\{Z : Z \in \Gamma,\ P(t;\theta) = P(t_0;\theta)\}). \qquad (2.19) \]

The limits are determined by the same method used for the two-sided exact confidence interval.

Following the modified approach, one can construct a modified confidence interval based on two-sided tests by using a modified mid P-value. We define a modified two-sided mid P-value for testing a particular value of $\theta$ as

\[ P^*_{mid}(\theta) = P^*(\theta) - \tfrac{1}{2} P(\{Z : Z \in \Gamma,\ P(t;\theta) = P(t_0;\theta),\ T'(\theta) = t_0'(\theta)\}). \qquad (2.20) \]

Also, the limits are determined by the same method used for the two-sided exact confidence interval. We illustrate these confidence intervals for the common odds ratio using Tables 2.1 and 2.2. For Table 2.1, the 95% confidence interval by inverting a two-sided test is (1.38, 131.51) based on the ordinary mid P-values and (1.38, 35.51) based on modified mid P-values using $T' = \sum_k X_k^2(\theta)$. Using Table 2.2, the confidence intervals are (1.01, 12.58) and (1.01, 10.29) using the ordinary and modified mid P-values with $T' = \sum_k X_k^2(\theta)$, respectively. For these data sets, the confidence interval constructed by using the ordinary two-sided mid P-values is shorter than the ordinary one based on two one-sided mid P-values. For each type of interval, the modified interval is no wider than the ordinary one. Table 2.4 summarizes these 95% confidence intervals using Table 2.1 and Table 2.2.

For the conditional distribution having the fixed marginal counts of Table 2.1, Figure 2.19 shows the actual coverage probability as a function of the true log odds ratio, for the 95% confidence intervals based on inverting separate one-sided tests using the ordinary mid P-value or the modified mid P-value with $T' = \sum_k X_k^2(\theta)$. The exact method yields a coverage exceeding the nominal level, whereas the coverage of the mid P-value fluctuates about the nominal level. For either approach, for sufficiently large $|\log\theta|$, the actual probability of coverage of a $100(1 - \alpha)\%$ confidence interval is centered about $1 - \alpha/2$, and that of the modified mid P-value deviates less from $1 - \alpha/2$.

Figure 2.20 gives an analogous display for the confidence intervals based on inverting two-sided tests using the ordinary mid P-value or the modified mid P-value with $T' = \sum_k X_k^2(\theta)$. There is an advantage to the interval based on the modified P-value. For either approach, the actual probability of coverage of a $100(1 - \alpha)\%$ confidence interval is centered about the nominal level, and that of the modified mid P-value is even closer to the nominal level. For intervals using mid P-values, we suggest the use of the confidence interval based on inverting two-sided tests using the modified mid P-value.


Method                                            Data set 1        Data set 2

Exact CI
  Ordinary 1-sided P                              1.08, 531.51      0.86, 21.37
  Modified 1-sided P (P*, T' = sum_k X_k^2)       2.08, 67.35       1.01, 13.63
  Modified 1-sided P (P*, table probability)      2.08, 67.35       1.04, 14.87

  Ordinary 2-sided P                              1.29, 261.49      0.88, 15.92
  Modified 2-sided P (P*, T' = sum_k X_k^2)       1.38, 40.45       1.01, 10.30
  Modified 2-sided P (P*, table probability)      1.38, 40.45       1.01, 11.14

Approximate CI
  Mantel-Haenszel                                 1.03, 47.73       0.86, 12.93
  ML                                              1.28, 128.12      0.99, 17.64

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Table  2.4.  Various  95%  confidence  intervals  for  the  common  odds  ratio  using  mid 
P-value. 


Method                                                Data set 1        Data set 2

Approximate CI
  Ordinary 1-sided mid P                              1.34, 266.54      0.98, 16.89
  Modified 1-sided mid P (P*, T' = sum_k X_k^2)       2.22, 56.00       1.01, 13.61
  Modified 1-sided mid P (P*, table probability)      2.22, 56.00       1.04, 14.85
  Ordinary 2-sided mid P                              1.38, 131.51      1.01, 12.58
  Modified 2-sided mid P (P*, T' = sum_k X_k^2)       1.38, 35.51       1.01, 10.29


Figure 2.9.  Coverage probability for confidence intervals based on inverting one-sided tests with $T' = \sum_k X_k^2(\theta)$, for conditional distribution based on margins of Table 2.1.  (Axes: COVERAGE P versus LOG THETA.)

Figure 2.10.  Coverage probability for confidence intervals based on inverting one-sided tests with $T' = P(Z)$, for conditional distribution based on margins of Table 2.1.  (Horizontal axis: LOG THETA, from -4 to 4.)

Figure 2.11.  Coverage probability for confidence intervals based on inverting one-sided tests with $T' = \sum_k X_k^2(\theta)$, for conditional distribution based on first K partial tables of Table 2.2.  (Panels: K=6, K=9, K=12.  Axes: COVERAGE P versus LOG THETA.)

Figure 2.12.  Coverage probability for confidence intervals based on inverting two-sided tests with $T' = \sum_k X_k^2(\theta)$, for conditional distribution based on margins of Table 2.1.  (Horizontal axis: LOG THETA.)

Figure 2.13.  Coverage probability for confidence intervals based on inverting two-sided tests with $T' = P(Z)$, for conditional distribution based on margins of Table 2.1.

Figure 2.14.  Coverage probability for confidence intervals based on inverting two-sided tests with $T' = \sum_k X_k^2(\theta)$, for conditional distribution based on first K partial tables of Table 2.2.

Figure 2.15.  Expected length of confidence intervals for $\theta$, with $T' = \sum_k X_k^2(\theta)$, for conditional distribution based on margins of Table 2.1.  (Vertical axis: LENGTH (THETA).)

Figure 2.16.  Expected length of confidence intervals for $\log\theta$, with $T' = \sum_k X_k^2(\theta)$, for conditional distribution based on margins of Table 2.1.  (Vertical axis: LENGTH (LOG THETA).)

Figure 2.17.  Expected length of confidence intervals for $\theta$, conditional on $T \ne t_{\min}$ or $t_{\max}$, with $T' = \sum_k X_k^2(\theta)$, for conditional distribution based on margins of Table 2.1.  (Vertical axis: LENGTH (THETA).)

Figure 2.18.  Expected length of confidence intervals for $\log\theta$, conditional on $T \ne t_{\min}$ or $t_{\max}$, with $T' = \sum_k X_k^2(\theta)$, for conditional distribution based on margins of Table 2.1.  (Vertical axis: LENGTH (LOG THETA).)

Figure 2.19.  Coverage probability for confidence intervals based on inverting one-sided tests using mid P-values with $T' = \sum_k X_k^2(\theta)$, for conditional distribution based on margins of Table 2.1.  (Horizontal axis: LOG THETA.)

Figure 2.20.  Coverage probability for confidence intervals based on inverting two-sided tests using mid P-values with $T' = \sum_k X_k^2(\theta)$, for conditional distribution based on margins of Table 2.1.  (Axes: COVERAGE P versus LOG THETA.)


2.5  Connections with Logistic Regression


Consider a set of independent binary variables, $Y_1, \ldots, Y_n$. Corresponding to each variable, $Y_j$, there is a $(p \times 1)$ vector $x_j = (x_{1j}, \ldots, x_{pj})'$ of explanatory variables. Let $\pi_j$ be the probability that $Y_j = 1$. Suppose that the response is related to the explanatory variables by the logistic regression model,

\[ \log \frac{\pi_j}{1 - \pi_j} = \gamma + x_j'\beta. \qquad (2.21) \]

The likelihood function is

\[ \frac{\exp\left[\sum_{j=1}^{n} y_j (x_j'\beta + \gamma)\right]}{\prod_{j=1}^{n} \left[1 + \exp(x_j'\beta + \gamma)\right]}. \]

The $p \times 1$ vector of sufficient statistics for $\beta$ is $t = \sum_{j=1}^{n} x_j y_j$.

Suppose $p = 2$, and we want to conduct inferences about $\beta_1$. Again, one can eliminate $\beta_2$ by conditioning on its sufficient statistic, $t_2 = \sum_{j=1}^{n} x_{2j} y_j$. One can treat the data for the logistic regression model as a three-way 2 x I x K table, where I and K are the numbers of distinct values of the explanatory variables $X_1$ and $X_2$, respectively.

Exact inference in logistic regression often is highly discrete, even degenerate. One can often alleviate this problem somewhat by treating the data as a contingency table and using the alternative way of constructing P-values discussed in Section 2. To illustrate, for Table 2.1 we let $\pi_{ij}$ denote the probability of cure for the jth individual at the ith penicillin level. The logistic model has form $\log[\pi_{ij}/(1 - \pi_{ij})] = \gamma_i + \beta x_{ij}$, $i = 1, \ldots, 3$, where $x_{ij}$ is a dummy variable for delay. The observed value of the sufficient statistic $T$ is 14. For testing $H_0 : \beta = 0$, the exact one-sided P-value is $P = P(T \ge 14) = 0.0200$. The modified exact P-value, using $T' = \sum_k X_k^2$ or the table probability, is 0.0028.


2.6  Discussion


We have shown that use of a modified P-value leads to exact tests and confidence intervals that are less conservative than the usual ones. The improvement can be considerable when K is large but n is not, in which case there may be a large number of tables with different secondary statistic values that have the same primary test statistic value.

We  prefer  modified  exact  tests  and  confidence  intervals  over  the  ordinary  exact 
ones,  because  they  are  less  conservative  than  the  ordinary  ones  but  still  guarantee  at 
least  the  nominal  level.  We  prefer  confidence  intervals  based  on  inverting  two-sided 
tests  over  those  based  on  inverting  two  separate  one-sided  tests,  because  they  tend  to 
be  less  conservative.  Likewise,  for  confidence  intervals  using  mid  P-values,  we  prefer 
intervals  based  on  inverting  two-sided  tests  using  modified  mid  P-values. 

For the secondary statistic, we have used $\sum_k X_k^2$ and the table probability in our examples, and clearly the reduction in conservativeness occurs with test statistics for more general alternatives. A FORTRAN program has been prepared, designed for IBM-compatible PCs or UNIX workstations, for computing modified P-values for tests of conditional independence and modified confidence intervals for an assumed common odds ratio. This program also computes the actual coverage probability and the expected length of confidence intervals using four methods. This program, for 2 x 2 x K tables, is an adaptation of one written by Vollset and Hirji (1991) for ordinary exact inference for such tables. Appendix A contains the FORTRAN source code.


CHAPTER  3 

APPROXIMATING  EXACT  INFERENCE  ABOUT  CONDITIONAL  ASSOCIATION 


3.1  Introduction 


For three-way tables, consider the hypothesis of conditional independence of X and Y, given Z. This hypothesis is usually tested against the alternative of no three-factor interaction. The general alternative that permits three-factor interaction is the general loglinear model for a three-way table and has the form

$$\log m_{ijk} = \mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \lambda_{ij}^{XY} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ} + \lambda_{ijk}^{XYZ}. \qquad (3.1)$$

When X or Y is ordinal, narrower alternatives can be constructed for the exact tests.

We suggest exact inference regarding conditional associations in three-way contingency tables. For I x J x K tables, we discuss six test statistics for conditional independence that have natural connections with loglinear models for various alternatives. We use a simulation algorithm to obtain precise estimates of exact P-values for cases that are currently computationally infeasible.

For three-way contingency tables, current computational algorithms for the exact methods are restricted to certain analyses for 2 x J x K tables. Also, when the sample size is small or when the contingency tables are sparse, large-sample approximations can be questionable. The Monte Carlo method is an alternative to either the exact or asymptotic methods. This method is based on estimating the exact conditional sampling distribution of the statistic by generating random tables having the relevant fixed margins. The advantage of this method is that the number of tables generated is fixed in advance, and the computing time does not depend greatly on the sample size n and the table size, compared to methods for exact analysis. For the random table generation, we use the procedure by Patefield (1981) that simulates hypergeometric distributions.
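The Monte Carlo idea described above can be sketched as follows. This is only an illustration under stated assumptions: instead of Patefield's (1981) algorithm, each random stratum is generated by randomly permuting column labels against fixed row labels, which also yields tables with the observed margins under the multiple hypergeometric null distribution. The function names, the choice of the summed Pearson statistic, and the default number of simulations are hypothetical.

```python
import numpy as np


def random_table(table, rng):
    """One random I x J table with the same row and column margins as `table`.

    Uses a label permutation (not Patefield's algorithm): row labels stay
    fixed and column labels are shuffled among the n observations.
    """
    t = np.asarray(table, dtype=int)
    n_rows, n_cols = t.shape
    row_labels = np.repeat(np.arange(n_rows), t.sum(axis=1))
    col_labels = rng.permutation(np.repeat(np.arange(n_cols), t.sum(axis=0)))
    out = np.zeros((n_rows, n_cols), dtype=int)
    np.add.at(out, (row_labels, col_labels), 1)
    return out


def pearson_sum(strata):
    """Sum over strata of the Pearson statistic (assumes positive margins)."""
    total = 0.0
    for s in strata:
        s = np.asarray(s, dtype=float)
        m = np.outer(s.sum(axis=1), s.sum(axis=0)) / s.sum()
        total += ((s - m) ** 2 / m).sum()
    return total


def monte_carlo_p(strata, statistic=pearson_sum, n_sim=2000, seed=0):
    """Estimate P(T >= t_obs) from n_sim random tables with fixed margins."""
    rng = np.random.default_rng(seed)
    t_obs = statistic(strata)
    hits = sum(
        statistic([random_table(s, rng) for s in strata]) >= t_obs - 1e-9
        for _ in range(n_sim)
    )
    return hits / n_sim
```

Because the number of simulated tables is fixed in advance, the cost grows with `n_sim` rather than with the size of the reference set.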

Section 2 discusses exact tests of conditional independence in I x J x K tables using three statistics that are popular for asymptotic tests. These are naturally linked to alternatives corresponding to loglinear models that assume a lack of three-factor interaction. Section 3 presents three other statistics that do not require this assumption. All six test statistics are score statistics for loglinear models that treat none, one, or both of the classifications as ordinal. Section 4 discusses possible alternative ways of forming modified exact P-values in I x J x K contingency tables, generalizing the modified P-value discussed in Chapter 2. We propose modified exact P-values for six tests of conditional independence with I x J x K tables.

Computational algorithms have limited availability for tests of conditional independence when I and J exceed two. Section 5 describes a Monte Carlo sampling routine that approximates the ordinary and modified exact P-values. We utilize six test statistics for exact tests of conditional independence. Section 6 illustrates approximate exact tests of conditional independence with examples, and Section 7 explains a FORTRAN program utilizing the simulation algorithm.


3.2  Tests of Conditional Independence Assuming No Three-factor Interaction


This section presents three test statistics, proposed by Birch (1965), for testing conditional independence of X and Y, given Z, in I x J x K contingency tables. We present loglinear models for which these are score statistics. These models assume a lack of three-factor interaction. In the next section, we present three adaptations of these statistics that do not require that assumption. In each case, one test treats both X and Y as nominal, one test treats X as nominal and Y as ordinal, and one test treats both as ordinal.

The asymptotic chi-squared theory is well developed for the statistics we present. Our focus will be to construct exact tests of conditional independence, using these statistics with the reference set $\Gamma$ of tables with the same margins. We use score statistics for loglinear models rather than likelihood-ratio or Wald statistics. This makes the computations for exact analyses simpler, since one does not need to fit the model for each table in $\Gamma$.

3.2.1  Nominal-by-Nominal Test

Birch (1965), Landis et al. (1978), and Mantel and Byar (1978) generalized the Cochran-Mantel-Haenszel statistic to handle more than two groups or more than two responses. Suppose X and Y are nominal. Let $n_k$ denote the counts for cells in the first $I - 1$ rows and $J - 1$ columns for stratum k of Z. Conditional on the row and column totals in that stratum, let $m_k$ denote the null expected value of $n_k$. Then $d = \sum_k (n_k - m_k)$ represents the $(I - 1)(J - 1) \times 1$ vector having elements

$$d_{ij} = \sum_k \left(n_{ijk} - \frac{n_{i+k}\, n_{+jk}}{n_{++k}}\right), \quad i = 1, \ldots, I - 1, \quad j = 1, \ldots, J - 1. \qquad (3.2)$$


Let $V_k$ denote the null covariance matrix of $n_k$, where

$$\mathrm{Cov}(n_{ijk}, n_{i'j'k}) = \frac{n_{i+k}(\delta_{ii'} n_{++k} - n_{i'+k})\, n_{+jk}(\delta_{jj'} n_{++k} - n_{+j'k})}{n_{++k}^2 (n_{++k} - 1)}. \qquad (3.3)$$
Then $V = \sum_k V_k$ is the null covariance matrix of $d$. The efficient score statistic for testing conditional independence against the alternative of no three-factor interaction is

$$C^2 = d'V^{-1}d. \qquad (3.4)$$

This is also called the generalized Cochran-Mantel-Haenszel statistic. Under conditional independence, this statistic has a large-sample chi-squared distribution with $df = (I - 1)(J - 1)$. For $K = 1$ stratum with n observations, the statistic reduces to the multiple $(n - 1)/n$ of the Pearson chi-squared statistic for testing independence.

The generalized Cochran-Mantel-Haenszel statistic is sensitive to conditional associations when the association is similar in each stratum. Hence, relative to that case, it has low power for detecting an association in which the patterns of association for some of the strata are in the opposite direction of the patterns displayed by other strata.
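A minimal sketch of the generalized Cochran-Mantel-Haenszel computation follows, assuming the table is stored as a NumPy array of shape (I, J, K) and that every stratum has positive margins and more than one observation. The function name is hypothetical. The covariance in (3.3) factors into a row part and a column part, so each stratum's contribution is written as a Kronecker product.

```python
import numpy as np


def generalized_cmh(table):
    """Return (d' V^{-1} d, df) for an I x J x K array of counts."""
    t = np.asarray(table, dtype=float)
    n_i, n_j, n_k = t.shape
    size = (n_i - 1) * (n_j - 1)
    d = np.zeros(size)
    v = np.zeros((size, size))
    for k in range(n_k):
        n = t[:, :, k]
        total = n.sum()
        r, c = n.sum(axis=1), n.sum(axis=0)
        m = np.outer(r, c) / total                      # null expected counts
        d += (n - m)[: n_i - 1, : n_j - 1].ravel()
        # Row and column factors of the covariance formula (3.3).
        a = (total * np.diag(r) - np.outer(r, r))[: n_i - 1, : n_i - 1]
        b = (total * np.diag(c) - np.outer(c, c))[: n_j - 1, : n_j - 1]
        v += np.kron(a, b) / (total ** 2 * (total - 1))
    return d @ np.linalg.solve(v, d), size
```

For a single stratum the result can be checked against $(n - 1)/n$ times the Pearson statistic, as stated in the text.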


3.2.2  Ordinal-by-Ordinal Test


When X and Y are ordinal, it often makes sense to test against a narrow alternative, corresponding to a monotone trend in the conditional association. It then makes sense to form a test statistic using a model that is a special case of the no three-factor interaction model and reflects the ordinality, such as the model of homogeneous linear-by-linear association,

$$\log m_{ijk} = \mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \beta u_i v_j + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}. \qquad (3.5)$$

It replaces the general association term $\lambda_{ij}^{XY}$ by a linear-by-linear term $\beta u_i v_j$, where $\{u_i\}$ and $\{v_j\}$ are monotone scores for levels of X and Y. The parameter $\beta$ in that model describes the X-Y partial association. The model of conditional independence of X and Y is its special case in which $\beta = 0$. For this model, the sufficient statistic for $\beta$ is $\sum_k [\sum_i \sum_j u_i v_j n_{ijk}]$. When $I = J = 2$, the usual statistic $\sum_k n_{11k}$ results from the scores $u_1 = v_1 = 1$, $u_2 = v_2 = 0$. This is Birch's exact test statistic for testing conditional independence in 2 x 2 x K contingency tables, and we have utilized this statistic in Chapter 2 for the conditional exact test. Also, Mehta, Patel and Gray (1985) and Vollset, Hirji and Elashoff (1991) used this statistic to implement the exact test.

For the asymptotic test of $H_0: \beta = 0$, one can use Mantel's (1963) generalized statistic for detecting association between ordinal variables. This ordinal test focuses the departure from independence on a single degree of freedom. Suppose we expect a monotone conditional relationship between X and Y, with the same direction at each level of Z, and suppose that we can assign monotone scores $\{u_i\}$ to levels of X and $\{v_j\}$ to levels of Y. Then there is evidence of positive trend if, within each stratum, the statistic $\sum_i \sum_j u_i v_j n_{ijk}$ is greater than its expectation under independence.

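For the model (3.5), given the marginal totals in each stratum and under conditional independence of X and Y,

$$E\Big(\sum_i \sum_j u_i v_j n_{ijk}\Big) = \frac{\big(\sum_i u_i n_{i+k}\big)\big(\sum_j v_j n_{+jk}\big)}{n_{++k}},$$

$$\mathrm{Var}\Big(\sum_i \sum_j u_i v_j n_{ijk}\Big) = \frac{1}{n_{++k} - 1}\left[\sum_i u_i^2 n_{i+k} - \frac{\big(\sum_i u_i n_{i+k}\big)^2}{n_{++k}}\right] \times \left[\sum_j v_j^2 n_{+jk} - \frac{\big(\sum_j v_j n_{+jk}\big)^2}{n_{++k}}\right].$$

To summarize the correlation information from the K strata, Mantel (1963) proposed the statistic

$$M^2 = \frac{\Big\{\sum_k \big[\sum_i \sum_j u_i v_j n_{ijk} - E\big(\sum_i \sum_j u_i v_j n_{ijk}\big)\big]\Big\}^2}{\sum_k \mathrm{Var}\big(\sum_i \sum_j u_i v_j n_{ijk}\big)}. \qquad (3.6)$$

This is the score statistic for testing conditional independence for model (3.5). It has an asymptotic chi-squared distribution with $df = 1$.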
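Mantel's statistic can be computed directly from the stratum margins and the scores. The sketch below assumes an array of shape (I, J, K) and score vectors `u` and `v` of lengths I and J; the function name is hypothetical.

```python
import numpy as np


def mantel_m2(table, u, v):
    """Mantel's (1963) ordinal statistic for an I x J x K array of counts."""
    t = np.asarray(table, dtype=float)
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    num = 0.0
    den = 0.0
    for k in range(t.shape[2]):
        n = t[:, :, k]
        total = n.sum()
        r, c = n.sum(axis=1), n.sum(axis=0)
        observed = u @ n @ v                             # sum_ij u_i v_j n_ijk
        expected = (u @ r) * (v @ c) / total
        var_u = (u ** 2) @ r - (u @ r) ** 2 / total
        var_v = (v ** 2) @ c - (v @ c) ** 2 / total
        num += observed - expected
        den += var_u * var_v / (total - 1)
    return num ** 2 / den
```

With $I = J = 2$ and scores $(1, 0)$ for both variables, this reduces to the squared, standardized form of the statistic $\sum_k n_{11k}$.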

3.2.3  Nominal-by-Ordinal Test


Suppose the row variable X is nominal and the column variable Y is ordinal. A useful loglinear model replaces the ordered row scores in model (3.5) by unordered parameters $\{\mu_i\}$,

$$\log m_{ijk} = \mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \mu_i v_j + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}. \qquad (3.7)$$

The sufficient statistics for $\{\mu_i\}$ are $\sum_k \sum_j v_j n_{ijk}$, $i = 1, \ldots, I$. These can be interpreted as the row sums for a response Y within each level of X, using the scores $\{v_j\}$, summed over the strata. Assuming the model holds, we can test conditional independence by testing $\mu_1 = \mu_2 = \cdots = \mu_I$. Let $Y_1, \ldots, Y_{n_{++k}}$ be a random sample within stratum k, which takes scores $v_1, \ldots, v_J$. Let $l$ denote the $(I - 1) \times 1$ vector having elements

$$l_i = \sum_k n_{i+k}(W_{ik} - W_k), \qquad (3.8)$$

where

$$W_{ik} = \sum_j n_{ijk} v_j / n_{i+k}$$

and

$$W_k = \sum_i \sum_j n_{ijk} v_j / n_{++k}, \quad k = 1, \ldots, K.$$

Note that $W_{ik}$ is the row mean on Y at level i of X and level k of Z, treating Y as a response with scores $\{v_j\}$. Similarly, $W_k$ is the kth stratum mean for Y. Let $A$ denote the null covariance matrix of $l$, which has elements

$$\mathrm{Cov}(l_i, l_{i'}) = \sum_k \frac{n_{i+k}(\delta_{ii'} n_{++k} - n_{i'+k})}{n_{++k}(n_{++k} - 1)} \left[\sum_j v_j^2 n_{+jk} - \frac{\big(\sum_j v_j n_{+jk}\big)^2}{n_{++k}}\right]. \qquad (3.9)$$

Then the efficient score statistic for testing conditional independence against the alternative of (3.7) is $l'A^{-1}l$. This statistic is sensitive to location differences among the I conditional distributions of Y that are similar at each level of Z. The asymptotic null distribution is chi-squared with $df = I - 1$.

The  three  statistics  just  discussed  were  suggested  by  Birch  (1965)  for  testing 
conditional  independence.  The  three  asymptotic  tests  are  available  in  SAS  (PROC 
FREQ). 

3.2.4  Generalized  Tests 


The previous three statistics are special cases of a general statistic proposed by Landis et al. (1978). Let $n_k$ denote a column vector of the cell counts in stratum k, and let $m_k$ denote their expected values. Also let $P_{i+k}$ denote the marginal proportion of the ith row and let $P_{+jk}$ denote the marginal proportion of the jth column. We introduce the following notation to define the generalized test statistic.

$$n_{ik}' = (n_{i1k}, \ldots, n_{iJk})$$

$$n_k' = (n_{1k}', \ldots, n_{Ik}')$$

$$P_{i+k} = n_{i+k}/n_{++k}$$

$$P_{+jk} = n_{+jk}/n_{++k}$$

$$P_{*+k}' = (P_{1+k}, P_{2+k}, \ldots, P_{I+k}) = \left(\frac{n_{1+k}}{n_{++k}}, \frac{n_{2+k}}{n_{++k}}, \ldots, \frac{n_{I+k}}{n_{++k}}\right)$$

$$P_{+*k}' = (P_{+1k}, P_{+2k}, \ldots, P_{+Jk}) = \left(\frac{n_{+1k}}{n_{++k}}, \frac{n_{+2k}}{n_{++k}}, \ldots, \frac{n_{+Jk}}{n_{++k}}\right)$$
Assume that cell counts from different strata are independent. Landis et al. (1978) showed that under the hypothesis of conditional independence, the expected value and covariance matrix of the frequencies are, respectively,

$$m_k = E[n_k \mid H_0] = n_{++k}(P_{*+k} \otimes P_{+*k}) \qquad (3.10)$$

and

$$\mathrm{Var}[n_k \mid H_0] = \frac{n_{++k}^2}{n_{++k} - 1}\left[(D_{P_{*+k}} - P_{*+k}P_{*+k}') \otimes (D_{P_{+*k}} - P_{+*k}P_{+*k}')\right], \qquad (3.11)$$

where $\otimes$ denotes Kronecker product multiplication and $D_a$ is a diagonal matrix with the elements of $a$ on the main diagonal.

The generalized statistic for testing conditional independence is defined as

$$Q_M = G'V_G^{-1}G, \qquad (3.12)$$

where

$$G = \sum_k B_k(n_k - m_k),$$

$$V_G = \sum_k B_k[\mathrm{Var}(n_k \mid H_0)]B_k',$$

and where

$$B_k = R_k \otimes C_k$$

is a matrix of fixed constants based on row scores $R_k$ and column scores $C_k$ for the kth stratum. When the null hypothesis is true, the statistic $Q_M$ is approximately distributed as chi-squared with degrees of freedom equal to the rank of $B_k$.

Suppose the row variable X is nominal and the column variable Y is ordinal. Then the mean score of Y is meaningful. In this case, the mean score is computed for each row of the table, and the alternative hypothesis is that, for at least one stratum, the mean scores of the I rows are unequal. Then the statistic is sensitive to location differences among the I distributions of Y.

For this case we can define the matrix $R_k$ that has dimension $(I - 1) \times I$ as

$$R_k = [I_{I-1}, -J_{I-1}], \qquad (3.13)$$

where $I_{I-1}$ is an identity matrix of rank $I - 1$, and $J_{I-1}$ is an $(I - 1) \times 1$ vector of ones. The matrix has the effect of forming $I - 1$ independent contrasts of I mean scores. The matrix $C_k$ has dimension $1 \times J$, and the scores are specified as one for each column. Then $Q_M$ sums over the K strata information about how I row means compare to their null expected values, and it has $df = I - 1$.

When both variables are ordinal, $R_k$ and $C_k$ can be defined as $R_k = (u_1, \ldots, u_I)$ and $C_k = (v_1, \ldots, v_J)$. If the scores $R_k$ and $C_k$ are the same for all strata, $Q_M$ simplifies to $M^2$.

When both variables are nominal, $R_k = [I_{I-1}, -J_{I-1}]$ and $C_k = [I_{J-1}, -J_{J-1}]$ can be used. Then $Q_M$ simplifies to $d'V^{-1}d$ with $df = (I - 1)(J - 1)$.
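The general statistic lends itself to a compact matrix implementation. The following sketch assumes an array of shape (I, J, K), the same score matrices `R` and `C` in every stratum, and cell counts vectorized row by row as in the notation above; the function names are hypothetical.

```python
import numpy as np


def nominal_scores(levels):
    """Contrast matrix [I_{L-1}, -J_{L-1}] for a nominal variable with L levels."""
    return np.hstack([np.eye(levels - 1), -np.ones((levels - 1, 1))])


def q_m(table, R, C):
    """Generalized statistic G' V_G^{-1} G with B = R (Kronecker) C."""
    t = np.asarray(table, dtype=float)
    R = np.atleast_2d(np.asarray(R, dtype=float))
    C = np.atleast_2d(np.asarray(C, dtype=float))
    B = np.kron(R, C)
    G = np.zeros(B.shape[0])
    VG = np.zeros((B.shape[0], B.shape[0]))
    for k in range(t.shape[2]):
        n = t[:, :, k]
        total = n.sum()
        p = n.sum(axis=1) / total                        # row proportions
        q = n.sum(axis=0) / total                        # column proportions
        m = total * np.kron(p, q)                        # null expected counts
        var = total ** 2 / (total - 1) * np.kron(
            np.diag(p) - np.outer(p, p), np.diag(q) - np.outer(q, q)
        )
        G += B @ (n.ravel() - m)
        VG += B @ var @ B.T
    return G @ np.linalg.solve(VG, G)
```

Passing score vectors for both `R` and `C` gives the ordinal-by-ordinal case, `nominal_scores(I)` with a score vector gives the nominal-by-ordinal case, and `nominal_scores` for both gives the nominal-by-nominal case.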

For  exact  tests  of  conditional  independence  in  I x J x K tables,  we  discussed  test 
statistics  assuming  a lack  of  three-factor  interaction.  These  are  score  statistics  for 
loglinear  models  that  treat  none,  one,  or  both  of  the  classifications  as  ordinal.  Also 
they  have  asymptotic  chi-squared  distributions. 

3.3  Tests of Conditional Independence Permitting Three-factor Interaction


The tests discussed so far assume no three-factor interaction. Suppose, instead, we expect the nature of the association between X and Y to vary considerably across levels of Z. Then one would test against an alternative that permits the association to vary across the strata of Z.




3.3.1  Nominal-by-Nominal Test


Suppose X and Y are nominal. Then one could test conditional independence against the saturated loglinear model, since the only more general model is the saturated model. An efficient score statistic is the Pearson statistic for testing conditional independence against the alternative of the saturated model (Agresti 1992). Letting $X_k^2$ denote the Pearson statistic for testing independence within the kth level of Z, this statistic is $\sum_k X_k^2$. The asymptotic distribution of this statistic is chi-squared with $df = K(I - 1)(J - 1)$, since at each partial table $X_k^2$ has an asymptotic chi-squared distribution with $df = (I - 1)(J - 1)$, and we have K independent partial tables. Also, this is the df for testing a loglinear model of conditional independence against the most general alternative.
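The summed Pearson statistic is simple to compute stratum by stratum. The sketch below assumes an array of shape (I, J, K) with positive margins in every stratum; the function name and the small two-stratum table are hypothetical illustrations.

```python
import numpy as np


def pearson_sum(table):
    """Return (sum_k X_k^2, df) for an I x J x K array of counts."""
    t = np.asarray(table, dtype=float)
    n_i, n_j, n_k = t.shape
    total = 0.0
    for k in range(n_k):
        n = t[:, :, k]
        m = np.outer(n.sum(axis=1), n.sum(axis=0)) / n.sum()   # fitted counts
        total += ((n - m) ** 2 / m).sum()
    return total, n_k * (n_i - 1) * (n_j - 1)


# Hypothetical data: two identical 2 x 2 strata.
example = np.stack([np.array([[3, 1], [1, 3]])] * 2, axis=2)
stat, df = pearson_sum(example)
```

Each stratum contributes its own Pearson statistic, so associations in opposite directions across strata add rather than cancel.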

3.3.2  Ordinal-by-Ordinal Test


The model of homogeneous linear-by-linear association (3.5) allows association between two ordinal variables in each table, and this association is homogeneous across levels of Z. When X and Y are ordinal, one sometimes expects a monotone association between X and Y that changes strength across levels of Z. We consider a loglinear model that permits association between X and Y within each level of Z, but heterogeneity among levels of Z, where the degree of heterogeneity is explained by its association parameter. A relevant loglinear model is then the heterogeneous linear-by-linear association model,

$$\log m_{ijk} = \mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \beta_k u_i v_j + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}. \qquad (3.14)$$
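For this model, the null hypothesis of conditional independence is $H_0: \beta_1 = \cdots = \beta_K = 0$. The loglikelihood is

$$\begin{aligned}
L(m) &= \sum_i \sum_j \sum_k n_{ijk} \log m_{ijk} - \sum_i \sum_j \sum_k m_{ijk} \\
&= \sum_i \sum_j \sum_k n_{ijk}(\mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \beta_k u_i v_j + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}) - \sum_i \sum_j \sum_k m_{ijk} \\
&= n\mu + \sum_i \lambda_i^X n_{i++} + \sum_j \lambda_j^Y n_{+j+} + \sum_k \lambda_k^Z n_{++k} + \sum_k \beta_k \sum_i \sum_j u_i v_j n_{ijk} \\
&\quad + \sum_i \sum_k \lambda_{ik}^{XZ} n_{i+k} + \sum_j \sum_k \lambda_{jk}^{YZ} n_{+jk} - \sum_i \sum_j \sum_k m_{ijk}. \qquad (3.15)
\end{aligned}$$

For this model the sufficient statistic for $\beta_k$ is $\sum_i \sum_j u_i v_j n_{ijk}$. For $k = 1, \ldots, K$, the derivative of the loglikelihood is

$$\frac{\partial L(m)}{\partial \beta_k} = \sum_i \sum_j u_i v_j n_{ijk} - \sum_i \sum_j u_i v_j m_{ijk}.$$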


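Under the hypothesis of conditional independence, we have $m_{ijk} = n_{i+k} n_{+jk} / n_{++k}$. Hence, for $k = 1, \ldots, K$,

$$\frac{\partial L(m)}{\partial \beta_k} = \sum_i \sum_j u_i v_j (n_{ijk} - m_{ijk}) = \sum_i \sum_j u_i v_j \left(n_{ijk} - \frac{n_{i+k} n_{+jk}}{n_{++k}}\right) = n \sum_i \sum_j u_i v_j \left(p_{ijk} - \frac{p_{i+k} p_{+jk}}{p_{++k}}\right).$$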

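Let $s$ denote the $K \times 1$ vector having elements

$$s_k = \sum_i \sum_j u_i v_j \left(p_{ijk} - \frac{p_{i+k} p_{+jk}}{p_{++k}}\right) = \frac{1}{n} \sum_i \sum_j u_i v_j \left(n_{ijk} - \frac{n_{i+k} n_{+jk}}{n_{++k}}\right). \qquad (3.16)$$

Then $s$ can be defined as

$$s = \begin{pmatrix}
\sum_i \sum_j u_i v_j (p_{ij1} - p_{i+1} p_{+j1} / p_{++1}) \\
\sum_i \sum_j u_i v_j (p_{ij2} - p_{i+2} p_{+j2} / p_{++2}) \\
\vdots \\
\sum_i \sum_j u_i v_j (p_{ijk} - p_{i+k} p_{+jk} / p_{++k}) \\
\vdots \\
\sum_i \sum_j u_i v_j (p_{ijK} - p_{i+K} p_{+jK} / p_{++K})
\end{pmatrix}
= \frac{1}{n} \begin{pmatrix}
\sum_i \sum_j u_i v_j (n_{ij1} - n_{i+1} n_{+j1} / n_{++1}) \\
\sum_i \sum_j u_i v_j (n_{ij2} - n_{i+2} n_{+j2} / n_{++2}) \\
\vdots \\
\sum_i \sum_j u_i v_j (n_{ijk} - n_{i+k} n_{+jk} / n_{++k}) \\
\vdots \\
\sum_i \sum_j u_i v_j (n_{ijK} - n_{i+K} n_{+jK} / n_{++K})
\end{pmatrix}.$$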

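For fixed $k$, let $G_k(\pi) = \sum_i \sum_j u_i v_j (\pi_{ijk} - \pi_{i+k} \pi_{+jk} / \pi_{++k})$. Let $g_k$ represent the $IJ \times 1$ vector having elements

$$g_k(i, j) = \frac{1}{n_{++k}^2} \Big[\big(u_i n_{++k} - \sum_a u_a n_{a+k}\big)\big(v_j n_{++k} - \sum_b v_b n_{+bk}\big)\Big],$$

and let $g_k^*$ be the $IJK \times 1$ vector with $g_k^{*\prime} = (0'_{(k-1)IJ}, g_k', 0'_{(K-k)IJ})$. For example, evaluating the derivative at the sample proportions,

$$g_1^* = \frac{\partial G_1(\pi)}{\partial \pi}
= \frac{1}{\pi_{++1}^2} \begin{pmatrix}
(u_1 \pi_{++1} - \sum_a u_a \pi_{a+1})(v_1 \pi_{++1} - \sum_b v_b \pi_{+b1}) \\
(u_1 \pi_{++1} - \sum_a u_a \pi_{a+1})(v_2 \pi_{++1} - \sum_b v_b \pi_{+b1}) \\
\vdots \\
(u_i \pi_{++1} - \sum_a u_a \pi_{a+1})(v_j \pi_{++1} - \sum_b v_b \pi_{+b1}) \\
\vdots \\
(u_I \pi_{++1} - \sum_a u_a \pi_{a+1})(v_J \pi_{++1} - \sum_b v_b \pi_{+b1}) \\
0_{(K-1)IJ}
\end{pmatrix}
= \frac{1}{n_{++1}^2} \begin{pmatrix}
(u_1 n_{++1} - \sum_a u_a n_{a+1})(v_1 n_{++1} - \sum_b v_b n_{+b1}) \\
(u_1 n_{++1} - \sum_a u_a n_{a+1})(v_2 n_{++1} - \sum_b v_b n_{+b1}) \\
\vdots \\
(u_i n_{++1} - \sum_a u_a n_{a+1})(v_j n_{++1} - \sum_b v_b n_{+b1}) \\
\vdots \\
(u_I n_{++1} - \sum_a u_a n_{a+1})(v_J n_{++1} - \sum_b v_b n_{+b1}) \\
0_{(K-1)IJ}
\end{pmatrix}
= \begin{pmatrix} g_1 \\ 0_{(K-1)IJ} \end{pmatrix}$$

and

$$g_k^* = \frac{\partial G_k(\pi)}{\partial \pi}
= \frac{1}{\pi_{++k}^2} \begin{pmatrix}
0_{(k-1)IJ} \\
(u_1 \pi_{++k} - \sum_a u_a \pi_{a+k})(v_1 \pi_{++k} - \sum_b v_b \pi_{+bk}) \\
(u_1 \pi_{++k} - \sum_a u_a \pi_{a+k})(v_2 \pi_{++k} - \sum_b v_b \pi_{+bk}) \\
\vdots \\
(u_i \pi_{++k} - \sum_a u_a \pi_{a+k})(v_j \pi_{++k} - \sum_b v_b \pi_{+bk}) \\
\vdots \\
(u_I \pi_{++k} - \sum_a u_a \pi_{a+k})(v_J \pi_{++k} - \sum_b v_b \pi_{+bk}) \\
0_{(K-k)IJ}
\end{pmatrix}
= \frac{1}{n_{++k}^2} \begin{pmatrix}
0_{(k-1)IJ} \\
(u_1 n_{++k} - \sum_a u_a n_{a+k})(v_1 n_{++k} - \sum_b v_b n_{+bk}) \\
(u_1 n_{++k} - \sum_a u_a n_{a+k})(v_2 n_{++k} - \sum_b v_b n_{+bk}) \\
\vdots \\
(u_i n_{++k} - \sum_a u_a n_{a+k})(v_j n_{++k} - \sum_b v_b n_{+bk}) \\
\vdots \\
(u_I n_{++k} - \sum_a u_a n_{a+k})(v_J n_{++k} - \sum_b v_b n_{+bk}) \\
0_{(K-k)IJ}
\end{pmatrix}
= \begin{pmatrix} 0_{(k-1)IJ} \\ g_k \\ 0_{(K-k)IJ} \end{pmatrix}.$$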
Also let $D$ represent the $K \times IJK$ matrix such that row $k$ consists of $g_k^{*\prime}$, that is,

$$D = \begin{pmatrix} \partial G_1(\pi)/\partial \pi' \\ \vdots \\ \partial G_K(\pi)/\partial \pi' \end{pmatrix}.$$

The null asymptotic covariance matrix of $s$ is $H = D\Sigma D'/n$, where $n = \sum_i \sum_j \sum_k n_{ijk}$ and $\Sigma = \mathrm{Diag}(p) - pp'$ with $p = \{p_{ijk}\}$. The score statistic for testing $H_0: \beta_1 = \cdots = \beta_K = 0$ is then $s'H^{-1}s$. From Rao (1973, page 418), the asymptotic distribution of $s$ is K-variate normal. Its mean is zero and its dispersion matrix is the information matrix. Hence the asymptotic distribution of $s'H^{-1}s$ is chi-squared with $df = K$. The number of df is the number of components of parameters for testing, or the rank of the asymptotic covariance matrix.


3.3.3  Nominal-by-Ordinal Test


The loglinear model (3.7) implies there are row effects on the association, and these row effects are the same for each level of Z. In general cases when X is nominal and Y is ordinal, we might expect heterogeneity in the row effects on the association. Then a relevant loglinear model to allow heterogeneity across the strata is

$$\log m_{ijk} = \mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \mu_{ik} v_j + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}. \qquad (3.17)$$

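The model is sensitive to alternatives whereby means on Y vary across levels of both X and Z. For identifiability, we use constraints $\mu_{Ik} = 0$. For this model, the null hypothesis of conditional independence is $H_0: \mu_{ik} = 0$ for $i = 1, \ldots, I - 1$ and $k = 1, \ldots, K$. The loglikelihood is

$$\begin{aligned}
L(m) &= \sum_i \sum_j \sum_k n_{ijk} \log m_{ijk} - \sum_i \sum_j \sum_k m_{ijk} \\
&= \sum_i \sum_j \sum_k n_{ijk}(\mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \mu_{ik} v_j + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}) - \sum_i \sum_j \sum_k m_{ijk} \\
&= n\mu + \sum_i \lambda_i^X n_{i++} + \sum_j \lambda_j^Y n_{+j+} + \sum_k \lambda_k^Z n_{++k} + \sum_i \sum_j \sum_k \mu_{ik} v_j n_{ijk} \\
&\quad + \sum_i \sum_k \lambda_{ik}^{XZ} n_{i+k} + \sum_j \sum_k \lambda_{jk}^{YZ} n_{+jk} - \sum_i \sum_j \sum_k m_{ijk}. \qquad (3.18)
\end{aligned}$$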

For this model the sufficient statistic for $\mu_{ik}$ is $\sum_j v_j n_{ijk}$. For fixed $i$ and $k$, the derivative of the loglikelihood is

$$\frac{\partial L(m)}{\partial \mu_{ik}} = \sum_j v_j n_{ijk} - \sum_j v_j m_{ijk}.$$


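Under the hypothesis of conditional independence, we have $m_{ijk} = n_{i+k} n_{+jk} / n_{++k}$. Hence, for fixed $i$ and $k$,

$$\frac{\partial L(m)}{\partial \mu_{ik}} = \sum_j v_j (n_{ijk} - m_{ijk}) = \sum_j v_j \left(n_{ijk} - \frac{n_{i+k} n_{+jk}}{n_{++k}}\right) = n \sum_j v_j \left(p_{ijk} - \frac{p_{i+k} p_{+jk}}{p_{++k}}\right).$$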


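For $i = 1, \ldots, I - 1$ and $k = 1, \ldots, K$, let $q$ be the $K(I - 1) \times 1$ vector having elements

$$q_{ik} = \sum_j v_j \left(p_{ijk} - \frac{p_{i+k} p_{+jk}}{p_{++k}}\right) = \frac{1}{n} \sum_j v_j \left(n_{ijk} - \frac{n_{i+k} n_{+jk}}{n_{++k}}\right) = \frac{1}{n}\, n_{i+k}(W_{ik} - W_k), \qquad (3.19)$$

where $W_{ik} = \sum_j n_{ijk} v_j / n_{i+k}$ and $W_k = \sum_i \sum_j n_{ijk} v_j / n_{++k}$. Then $q$ can be defined as

$$q = \begin{pmatrix}
\sum_j v_j (p_{1j1} - p_{1+1} p_{+j1} / p_{++1}) \\
\sum_j v_j (p_{2j1} - p_{2+1} p_{+j1} / p_{++1}) \\
\vdots \\
\sum_j v_j (p_{(I-1)jk} - p_{(I-1)+k} p_{+jk} / p_{++k}) \\
\vdots \\
\sum_j v_j (p_{1jK} - p_{1+K} p_{+jK} / p_{++K}) \\
\sum_j v_j (p_{2jK} - p_{2+K} p_{+jK} / p_{++K}) \\
\vdots \\
\sum_j v_j (p_{(I-1)jK} - p_{(I-1)+K} p_{+jK} / p_{++K})
\end{pmatrix}.$$

Or it can be written as

$$q = \frac{1}{n} \begin{pmatrix}
\sum_j v_j (n_{1j1} - n_{1+1} n_{+j1} / n_{++1}) \\
\sum_j v_j (n_{2j1} - n_{2+1} n_{+j1} / n_{++1}) \\
\vdots \\
\sum_j v_j (n_{(I-1)jk} - n_{(I-1)+k} n_{+jk} / n_{++k}) \\
\vdots \\
\sum_j v_j (n_{(I-1)jK} - n_{(I-1)+K} n_{+jK} / n_{++K})
\end{pmatrix}.$$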
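For fixed $i$, $k$, let $G_{ik}(\pi) = \sum_j v_j (\pi_{ijk} - \pi_{i+k} \pi_{+jk} / \pi_{++k})$. Let $r_{ik}$ represent the $IJ \times 1$ vector having elements

$$r_{ik}(i', j) = \frac{1}{n_{++k}^2} \Big[\big(v_j n_{++k} - \sum_b v_b n_{+bk}\big)\big(n_{++k} \delta_{ii'} - n_{i+k}\big)\Big], \quad i' = 1, \ldots, I, \quad j = 1, \ldots, J,$$

and let $r_{ik}^*$ be the $IJK \times 1$ vector with $r_{ik}^{*\prime} = (0'_{(k-1)IJ}, r_{ik}', 0'_{(K-k)IJ})$. That is, evaluating the derivative at the sample proportions,

$$r_{ik}^* = \frac{\partial G_{ik}(\pi)}{\partial \pi}
= \frac{1}{\pi_{++k}^2} \begin{pmatrix}
0_{(k-1)IJ} \\
(v_1 \pi_{++k} - \sum_b v_b \pi_{+bk})(-\pi_{i+k}) \\
\vdots \\
(v_J \pi_{++k} - \sum_b v_b \pi_{+bk})(-\pi_{i+k}) \\
\vdots \\
(v_1 \pi_{++k} - \sum_b v_b \pi_{+bk})(\pi_{++k} - \pi_{i+k}) \\
\vdots \\
(v_J \pi_{++k} - \sum_b v_b \pi_{+bk})(\pi_{++k} - \pi_{i+k}) \\
\vdots \\
(v_1 \pi_{++k} - \sum_b v_b \pi_{+bk})(-\pi_{i+k}) \\
\vdots \\
(v_J \pi_{++k} - \sum_b v_b \pi_{+bk})(-\pi_{i+k}) \\
0_{(K-k)IJ}
\end{pmatrix}
= \frac{1}{n_{++k}^2} \begin{pmatrix}
0_{(k-1)IJ} \\
(v_1 n_{++k} - \sum_b v_b n_{+bk})(-n_{i+k}) \\
\vdots \\
(v_J n_{++k} - \sum_b v_b n_{+bk})(-n_{i+k}) \\
\vdots \\
(v_1 n_{++k} - \sum_b v_b n_{+bk})(n_{++k} - n_{i+k}) \\
\vdots \\
(v_J n_{++k} - \sum_b v_b n_{+bk})(n_{++k} - n_{i+k}) \\
\vdots \\
(v_1 n_{++k} - \sum_b v_b n_{+bk})(-n_{i+k}) \\
\vdots \\
(v_J n_{++k} - \sum_b v_b n_{+bk})(-n_{i+k}) \\
0_{(K-k)IJ}
\end{pmatrix}
= \begin{pmatrix} 0_{(k-1)IJ} \\ r_{ik} \\ 0_{(K-k)IJ} \end{pmatrix},$$

where the block with factor $(\pi_{++k} - \pi_{i+k})$ corresponds to row $i' = i$ and the blocks with factor $(-\pi_{i+k})$ correspond to rows $i' \ne i$.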

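Also let $E$ represent the $K(I - 1) \times IJK$ matrix such that the row corresponding to $i$, $k$ consists of $r_{ik}^{*\prime}$, that is,

$$E = \begin{pmatrix} \partial G_{11}(\pi)/\partial \pi' \\ \vdots \\ \partial G_{(I-1)1}(\pi)/\partial \pi' \\ \vdots \\ \partial G_{(I-1)K}(\pi)/\partial \pi' \end{pmatrix}.$$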

The null asymptotic covariance matrix of $q$ is $R = E\Sigma E'/n$. The score statistic for testing $H_0: \mu_{ik} = 0$ for $i = 1, \ldots, I - 1$ and $k = 1, \ldots, K$ is $q'R^{-1}q$. Its asymptotic distribution is chi-squared with $df = K(I - 1)$. The number of df is the rank of the asymptotic covariance matrix or the number of components of parameters for testing.

For exact tests, one identifies any of these six statistics with $T$ in the calculation of the exact P-value. We discuss next how to construct modified exact P-values for the six tests.


3.4  The Construction of the Modified Exact P-value


So far, we have discussed six test statistics for testing conditional independence of X and Y, given Z, in three-way contingency tables. The ordinary exact P-value can be constructed by utilizing these statistics. In Chapter 2, we proposed a modified exact P-value to reduce the degree of conservativeness. It is based on both the usual test statistic and, at the observed value of $T$, a secondary statistic $T'$ that generates a secondary partitioning. The statistic $T'$ is directed toward a broader alternative. Then $T'$ can catch some information about the validity of the null hypothesis when the assumed alternative for $T$ is not exactly satisfied. The modified exact P-value is defined in Chapter 2 as

$$P^* = P_{H_0}(T > t_o) + P_{H_0}(T = t_o, T' \ge t_o'),$$

when large values of $T$ and $T'$ contradict the null. We have shown in Chapter 2, using 2 x 2 x K tables, that the modified P-value has a less discrete sampling distribution, and modified tests reduce the degree of conservativeness. We can apply this modified approach to I x J x K tables to reduce the conservativeness and to get sharper results.
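To make the definition of $P^*$ concrete, the sketch below enumerates the reference set for a small 2 x 2 x K table and computes both the ordinary and the modified exact P-values, with $T = \sum_k n_{11k}$ and, as an assumed secondary statistic, $T' = \sum_k X_k^2$. The three strata are assumed to be the informative penicillin levels of Table 2.1 (not reproduced in this section), and the function names are hypothetical. Full enumeration is feasible only for small tables.

```python
from itertools import product
from math import comb, prod


def stratum_support(table):
    """List (x, probability, Pearson X^2) for every (1,1) count x that is
    compatible with the margins of one 2x2 table (positive margins assumed)."""
    (a, b), (c, d) = table
    n1, n2, c1, c2 = a + b, c + d, a + c, b + d
    n = n1 + n2
    support = []
    for x in range(max(0, c1 - n2), min(n1, c1) + 1):
        prob = comb(n1, x) * comb(n2, c1 - x) / comb(n, c1)
        det = x * (n2 - c1 + x) - (n1 - x) * (c1 - x)    # ad - bc
        support.append((x, prob, n * det ** 2 / (n1 * n2 * c1 * c2)))
    return support


def exact_p_values(strata, tol=1e-9):
    """Return (ordinary P, modified P*) by enumerating all tables."""
    supports = [stratum_support(t) for t in strata]
    t_obs = sum(t[0][0] for t in strata)
    t2_obs = sum(
        x2 for t, sup in zip(strata, supports) for x, _, x2 in sup if x == t[0][0]
    )
    p_ordinary = p_modified = 0.0
    for combo in product(*supports):
        t1 = sum(x for x, _, _ in combo)
        t2 = sum(x2 for _, _, x2 in combo)
        prob = prod(p for _, p, _ in combo)
        if t1 >= t_obs:
            p_ordinary += prob
        if t1 > t_obs or (t1 == t_obs and t2 >= t2_obs - tol):
            p_modified += prob
    return p_ordinary, p_modified


# Assumed strata (rows: no delay / delay; columns: cured / died).
penicillin = [[[3, 3], [0, 6]], [[6, 0], [2, 4]], [[5, 1], [6, 0]]]
p_ordinary, p_modified = exact_p_values(penicillin)
```

The modified value is never larger than the ordinary one, because tables tied with the observed table on $T$ count only when they are at least as extreme on $T'$.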

For testing conditional independence assuming no three-factor interaction, we denote $T_1$ to be the test statistic when both X and Y are nominal, $T_2$ to be the test statistic when X is nominal and Y is ordinal, $T_2'$ to be the test statistic when X is ordinal and Y is nominal, and $T_3$ to be the test statistic when both X and Y are ordinal. Also, let $T_4$, $T_5$, $T_5'$ and $T_6$ be the corresponding test statistics when we permit three-factor interaction. Note that these are score statistics.

In this section, we discuss possible alternative ways of forming modified P-values for testing conditional independence for I x J x K tables. Ordinary exact P-values for these six tests correspond to six loglinear models for primary alternative hypotheses. The general rule to construct the modified exact P-value is as follows. We use a score statistic for $T'$, in order to have consistency. If there is only one potential statistic for $T'$, we use that one. But if there is more than one potential statistic, we apply a basic principle to choose a $T'$ among them. We consider four such principles. The first principle is to choose a $T'$ from the next most general alternative, while keeping the same assumption as $T$ about three-factor interaction. The second principle is to choose a $T'$ from the most general alternative, while keeping the same assumption as $T$ about three-factor interaction. The third principle is to choose a $T'$ from the most general alternative among all cases. The fourth principle is to choose a $T'$ while keeping the nature of the classification variables. Next, we discuss all possible statistics for $T'$ for the six cases. Note that the potential statistics for $T'$ are $T_1$, $T_2$, $T_2'$, $T_3$, $T_4$, $T_5$, $T_5'$, and $T_6$. We first consider the tests assuming no three-factor interaction.

When both X and Y are nominal, the primary test statistic $T$ is $T_1$. The secondary statistic $T'$ can be $T_4$, since $T_4$ corresponds to a more general alternative hypothesis. Second, when X is nominal and Y is ordinal, $T$ is $T_2$ and $T'$ can be $T_1$, $T_4$, or $T_5$. Third, when both X and Y are ordinal, $T$ is $T_3$ and $T'$ can be $T_1$, $T_2$, $T_2'$, $T_4$, $T_5$, $T_5'$, or $T_6$. Since $T_3$ is constructed from the narrowest alternative, the other statistics can be potential statistics for $T'$.
Next, we consider the tests permitting three-factor interaction. First, when both X and Y are nominal, $T$ is $T_4$, but there is no general score statistic for $T'$, since $T$ is constructed from the most general alternative. We could, however, use the table probability for $T'$ for the secondary partitioning. Second, when X is nominal and Y is ordinal, $T$ is $T_5$ and $T'$ can be $T_4$. Finally, when both X and Y are ordinal, $T$ is $T_6$, and $T'$ can be $T_4$, $T_5$ or $T_5'$. Table 3.1 summarizes all possible statistics for $T'$ for the six tests.

We see that two cases have only one potential score statistic for $T'$. For the nominal-by-nominal case assuming no three-factor interaction, $T'$ is $T_4$. For the nominal-by-nominal case permitting three-factor interaction, there is no score statistic, but we could use the table probability. Also, for the nominal-by-ordinal case permitting three-factor interaction, $T'$ is $T_4$. For these three cases, there is only one choice for $T'$. For the other three cases, we apply a basic principle in order to choose a $T'$ among the potential statistics.

For the first principle, we choose a $T'$ from the next most general alternative, while keeping the same assumption as $T$ about three-factor interaction. Assuming no three-factor interaction, $(T, T')$ is $(T_2, T_1)$ for the nominal-by-ordinal case, since the nominal-by-nominal case is more general, and it also corresponds to the next most general alternative assuming no three-factor interaction in this case. For the ordinal-by-ordinal case, the next most general alternative corresponds to the nominal-by-ordinal case or the ordinal-by-nominal case. Hence $(T, T')$ is $(T_3, T_2)$ or $(T_3, T_2')$. Accordingly, for the ordinal-by-ordinal case permitting three-factor interaction, $(T, T')$ is $(T_6, T_5)$ or $(T_6, T_5')$.

The second principle is to choose T' from the most general alternative among the
three cases, while keeping the same assumption as T about three-factor interaction.
Then, assuming no three-factor interaction, (T, T') is (T2, T1) for the
nominal-by-ordinal case and (T3, T1) for the ordinal-by-ordinal case, since the
nominal-by-nominal case is the most general among the three cases. Also, for the
ordinal-by-ordinal case permitting three-factor interaction, (T, T') is (T6, T4).

For the third principle, that of the most general alternative among all cases, the
corresponding statistics for (T, T') are (T2, T4), (T3, T4), and (T6, T4), since T4
corresponds to the most general alternative among all cases. For the fourth principle,
that of keeping the nature of the classification variables, the corresponding statistics
for (T, T') are (T2, T5) and (T3, T6). For the ordinal-by-ordinal case permitting
three-factor interaction, T' does not have a potential statistic under this principle.

Among the four principles, we prefer the first, since modified P-values can be
defined for most cases using this principle, and it can utilize the ordinality of the
classification variables. For the second and third principles, T' does not consider
possible ordinality. Table 3.2 summarizes test statistics for the construction of
ordinary and modified exact P-values for testing conditional independence in
I x J x K contingency tables using the first principle. For I x J x K contingency
tables, the discreteness will not be severe when the sample size is large. But when the
sample size is small, the modified P-value can reduce the conservativeness. We discuss
implementation of the exact tests in the next section.


Table 3.1. All possible statistics for T' for six tests.

                                                          T'
                                    T     T1   T2   T2'  T3   T4   T5   T5'  T6

Assuming no three-factor interaction
  Nominal-by-Nominal                T1    .    .    .    .    T4   .    .    .
  Nominal-by-Ordinal                T2    T1   .    .    .    T4   T5   .    .
  Ordinal-by-Ordinal                T3    T1   T2   T2'  .    T4   T5   T5'  T6

Permitting three-factor interaction
  Nominal-by-Nominal                T4    .    .    .    .    .    .    .    .
  Nominal-by-Ordinal                T5    .    .    .    .    T4   .    .    .
  Ordinal-by-Ordinal                T6    .    .    .    .    T4   T5   T5'  .


Table 3.2. Test statistics for the construction of the ordinary and modified exact
P-values for testing conditional independence in I x J x K contingency tables.

                                          Ordinary      Modified
                                          P-value       P-value P*
                                          T             (T, T')

Assuming no three-factor interaction
  Nominal-by-Nominal                      T1            (T1, T4)
  Nominal-by-Ordinal                      T2            (T2, T1)
  Ordinal-by-Ordinal                      T3            (T3, T2)

Permitting three-factor interaction
  Nominal-by-Nominal                      T4            (T4, P(Z))
  Nominal-by-Ordinal                      T5            (T5, T4)
  Ordinal-by-Ordinal                      T6            (T6, T5)


3.5  Approximation  of  Exact  P-values 


For  three-way  contingency  tables,  algorithms  for  testing  conditional  independence 
are  available  in  widely-available  software  only  for  the  2 x J x K case  with  ordered 
columns  (StatXact  1991).  Even  for  table  sizes  where  software  exists,  the  reference  set 
of  tables  for  the  conditional  distribution  is  sometimes  too  large  for  an  exact  P-value 
computation.  For  instance,  sometimes  the  sample  size  is  moderately  large  but  there 
are  many  cells  and  the  table  is  sparse,  so  exact  methods  are  infeasible  but  the  use  of 
standard  asymptotic  theory  is  questionable. 

In some cases, one can obtain a very accurate approximation to the distribution
of the test statistic using a saddlepoint approximation. This higher-order asymptotic
approximation is more accurate than the normal approximation or the one- or
two-term Edgeworth expansion. It is applicable to conditional densities and tail
probabilities of sufficient statistics in exponential families. For example, to
approximate conditional tail probabilities, one can use an approximation due to
Skovgaard (1987). Davison (1988) applied the approximation to model (3.5) for
2 x 2 x K tables, and Pierce and Peters (1992) applied it to model (3.5) for K = 1.

To illustrate the saddlepoint approximation, we show how to apply it to the
homogeneous linear-by-linear association model (3.5) for arbitrary K. Let $\hat{\beta}$
denote the ML estimate of $\beta$ in that model. Let $G^2(I)$ and $G^2(L \times L)$ denote the
likelihood-ratio statistics for testing the goodness of fit of the conditional
independence and homogeneous linear-by-linear association models. The conditional
P-value for testing $H_0: \beta = 0$ against $H_a: \beta > 0$ has saddlepoint approximation

$$\Pr(T \ge t_0 \mid \{n_{i+k}\}, \{n_{+jk}\}) \approx 1 - \Phi(z) + \phi(z)\left(\frac{1}{w} - \frac{1}{z}\right), \tag{3.20}$$

where

$$z = \mathrm{sgn}(\hat{\beta})\sqrt{G^2(I) - G^2(L \times L)} \quad \text{and} \quad w = 2\sinh\left(\frac{\hat{\beta}}{2}\right)\left(\frac{|I_L|}{|I_I|}\right)^{1/2}.$$

The matrices $I_I$ and $I_L$ are the observed information matrices for the conditional
independence model and homogeneous linear-by-linear association model, and $\Phi$ and
$\phi$ denote the standard normal cdf and pdf.
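As a rough numerical illustration of (3.20), the following Python sketch evaluates the
tail approximation once z and w are available. The function names are hypothetical, and
w is taken as an input rather than computed, because forming it requires the observed
information matrices of the two fitted models, which are not reproduced here.

```python
import math


def normal_cdf(x):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def signed_root(g2_independence, g2_lxl, beta_hat):
    """z: signed square root of the likelihood-ratio difference."""
    if g2_independence < g2_lxl:
        raise ValueError("the nested model cannot fit better than the larger model")
    return math.copysign(math.sqrt(g2_independence - g2_lxl), beta_hat)


def saddlepoint_upper_tail(z, w):
    """Tail approximation 1 - Phi(z) + phi(z) * (1/w - 1/z); z and w must be nonzero."""
    if z == 0.0 or w == 0.0:
        raise ValueError("z and w must be nonzero")
    return 1.0 - normal_cdf(z) + normal_pdf(z) * (1.0 / w - 1.0 / z)
```

When w equals z the correction term vanishes and the result reduces to the ordinary
normal tail area, which gives a simple sanity check on any implementation.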


Since software is not yet available in the generality needed for the exact conditional
methods we have described for I x J x K tables, we next present an alternative method
that can approximate the exact conditional result as closely as needed. This is the
simple approach of performing a Monte Carlo simulation on the conditional set. The
Monte Carlo method is an alternative to computing either the exact or asymptotic
P-values. It is useful for those situations where the data set is too large for an exact
P-value computation or too sparse to rely on the asymptotic theory.

Agresti et al. (1979) utilized this method effectively for a variety of tests for
two-way tables. Even for large tables or large sample sizes, one can quickly approximate
as closely as needed the ordinary and modified exact P-values for the six statistics
presented in Section 2 and Section 3. The method consists of sampling contingency
tables from the conditional reference set in proportion to their probabilities, and
computing an unbiased point estimate and a narrow confidence interval for an exact
P-value. We constructed an algorithm to perform precise approximations for the
exact inferences using a table-generation procedure suggested by Patefield (1981).
For practical applications, we prefer this approximation to the saddlepoint because
it is available more generally (e.g., for multi-degree-of-freedom statistics for testing
vectors of parameters), because its accuracy is known to the user, and because that
accuracy can be set as finely as one requires.

We proposed ordinary and modified exact P-values for six tests, with T and T'
defined in Table 3.2. To illustrate, suppose we want to estimate a modified exact
one-sided P-value when X and Y are ordinal, assuming no three-factor interaction.
Then we test against the narrower alternative of the homogeneous linear-by-linear
association model (3.5). The secondary statistic T' is a test statistic directed toward
a broader alternative hypothesis. For T', one possibility is the score statistic for the
case of nominal-ordinal association assuming no three-factor interaction. Let $t_0$ and
$t_0'$ be the observed values of T and T'. In this case, for row scores $\{u_i\}$ and column
scores $\{v_j\}$, we have $T = \sum_k \sum_i \sum_j u_i v_j n_{ijk}$, and T' is a score statistic
discussed in Section 3.2.3. This is a one-sided test. Modified exact P-values for the
other tests can be constructed similarly by using T and T' in Table 3.2. They are
two-sided tests.

To implement the exact tests, we sample M contingency tables, with replacement,
from the reference set $\Gamma$ of tables with the same margins, where M is chosen to give
the desired degree of accuracy with some fixed probability. Define the upper critical
region of the reference set by

$$\Gamma^* = \{Z \in \Gamma : T > t_0 \ \text{or} \ (T = t_0 \ \text{and} \ T' \ge t_0')\}.$$

The other possibility for T' is to use the null table probability. Under the null
hypothesis of conditional independence, the probability of observing any specific
$Z \in \Gamma$ is

$$\Pr(Z = z) = \prod_{k=1}^{K} \frac{\left(\prod_i n_{i+k}!\right)\left(\prod_j n_{+jk}!\right)}{n_{++k}! \prod_i \prod_j n_{ijk}!}. \tag{3.21}$$

Then we define the critical region of the reference set by

$$\Gamma_P^* = \{Z \in \Gamma : T > t_0 \ \text{or} \ (T = t_0 \ \text{and} \ P(Z) \le P(N))\}.$$

For the ith table sampled, let $y_i = 1$ if $Z_i \in \Gamma^*$, and let $y_i = 0$ otherwise. The
point estimate of the modified P-value is

$$\hat{P}^* = \frac{1}{M} \sum_{i=1}^{M} y_i,$$

the proportion of sampled tables in $\Gamma^*$. Likewise, the estimate of the modified P-value
using the null table probability for T' can be defined using $\Gamma_P^*$, and we denote it by
$\hat{P}_P^*$.

For the estimate of the ordinary exact P-value, the upper critical region of the
reference set, $\Gamma'$, is

$$\Gamma' = \{Z \in \Gamma : T \ge t_0\},$$

and the estimate is the proportion of sampled tables that have a test statistic at least
as large as the observed one.
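The sampling scheme can be sketched in Python as follows. This is only an
illustration under stated assumptions: tables are generated stratum by stratum with a
simple permutation method rather than Patefield's (1981) algorithm, the primary
statistic is taken to be the linear-by-linear sum of u_i v_j n_ijk, ties on T are broken
by the null table probability, and all function names are hypothetical.

```python
import math
import random


def random_table(row_totals, col_totals, rng):
    """Draw one two-way table with the given margins (permutation method)."""
    # One column label per observation, shuffled and then dealt out to the rows.
    labels = [j for j, c in enumerate(col_totals) for _ in range(c)]
    rng.shuffle(labels)
    table = [[0] * len(col_totals) for _ in row_totals]
    start = 0
    for i, r in enumerate(row_totals):
        for j in labels[start:start + r]:
            table[i][j] += 1
        start += r
    return table


def lxl_statistic(strata, u, v):
    """T = sum over strata k and cells (i, j) of u_i * v_j * n_ijk."""
    return sum(u[i] * v[j] * n
               for table in strata
               for i, row in enumerate(table)
               for j, n in enumerate(row))


def log_null_probability(strata):
    """Log of the product of hypergeometric table probabilities over strata."""
    total = 0.0
    for table in strata:
        rows = [sum(r) for r in table]
        cols = [sum(c) for c in zip(*table)]
        total += sum(math.lgamma(x + 1) for x in rows + cols)
        total -= math.lgamma(sum(rows) + 1)
        total -= sum(math.lgamma(n + 1) for r in table for n in r)
    return total


def monte_carlo_pvalues(observed, u, v, n_samples=10000, seed=0):
    """Estimate the ordinary P-value and the table-probability-modified P-value."""
    rng = random.Random(seed)
    margins = [([sum(r) for r in t], [sum(c) for c in zip(*t)]) for t in observed]
    t0 = lxl_statistic(observed, u, v)
    logp0 = log_null_probability(observed)
    n_ordinary = n_modified = 0
    for _ in range(n_samples):
        z = [random_table(r, c, rng) for r, c in margins]
        t = lxl_statistic(z, u, v)
        if t >= t0:
            n_ordinary += 1
            # Tables tied on T count only if they are no more probable than the data.
            if t > t0 or log_null_probability(z) <= logp0 + 1e-9:
                n_modified += 1
    return n_ordinary / n_samples, n_modified / n_samples
```

Because the modified critical region is a subset of the ordinary one, the second
estimate can never exceed the first for the same set of sampled tables.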


3.6  Examples 


3.6.1  Example  1 

We illustrate the exact tests using Table 3.3. This is a cross-classification of job
satisfaction by income, controlling for gender, for black Americans sampled in the
General Social Survey of 1991. In order to utilize ordinality in studying the partial
association between income and satisfaction, we test conditional independence
against the model (3.5) of homogeneous linear-by-linear association. Using
equally-spaced row and column scores, the likelihood-ratio chi-squared statistic for
testing the fit of that model equals 12.33, with df = 17. The estimated association
parameter is $\hat{\beta}$ = 0.388 with s.e. = 0.155. The likelihood-ratio chi-squared statistic
for testing conditional independence, assuming the model, is 19.37 - 12.33 = 7.04 with
df = 1. There seems to be very strong evidence of a positive association between
income and satisfaction. However, the data are sparse enough to make large-sample
approximations questionable; yet the sample size is sufficiently large so that exact
analyses are infeasible. We used Monte Carlo sampling with M = 50,000, which
ensures that P-value estimators fall within about 0.004 of the true P-value with
probability at least 0.95.
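The precision attached to a Monte Carlo estimate can be checked with a short
calculation. The sketch below assumes the usual normal approximation to the binomial
proportion of sampled tables falling in the critical region; the function name is
hypothetical.

```python
import math


def margin_of_error(p_hat, n_samples, z=1.96):
    """Half-width of the approximate 95% interval for a Monte Carlo P-value."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n_samples)


# Worst case: the binomial variance is largest at p = 0.5.
worst_case = margin_of_error(0.5, 50000)

# Margins for estimates of the size that arises in practice.
for p_hat in (0.332, 0.024, 0.006):
    print(p_hat, round(margin_of_error(p_hat, 50000), 4))
```

With M = 50,000 the worst-case margin is a little over 0.004, and it shrinks toward
0.001 as the estimated P-value approaches zero.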

For the exact tests assuming no three-factor interaction, the estimated ordinary
exact P-values (with 95% precision indicated in parentheses) are 0.332 (± 0.004) for
the nominal-by-nominal test, 0.024 (± 0.001) for the nominal-by-ordinal test, and
0.006 (± 0.001) for the ordinal-by-ordinal test. Using T' defined in Table 3.2, the
corresponding estimated modified exact P-values P* are 0.332, 0.024, and 0.004.
Using instead the null table probability for T', the corresponding estimated modified
P-values $P_P^*$ are 0.332, 0.024, and 0.005. The distribution of T takes 121 distinct
values for the ordinal-by-ordinal test, and since the degree of discreteness is not
severe, the two types of P-values are essentially the same. The asymptotic P-values
are 0.335, 0.026, and 0.005, respectively. In this case, first-order asymptotic
approximations work quite well.

For the other exact tests, permitting three-factor interaction, the estimated ordinary
exact P-values are 0.281 for the nominal-by-nominal test, 0.089 for the
nominal-by-ordinal test, and 0.020 for the ordinal-by-ordinal test. The corresponding
estimated modified exact P-values, P* or $P_P^*$, are 0.281, 0.089, and 0.020. Also, the
corresponding asymptotic P-values are 0.277, 0.089, and 0.020. Table 3.4 summarizes
results for all six tests we have discussed. Note that we would not obtain strong
evidence of association if we ignored the ordinality of the variables. For large n,
since the discreteness is not severe, the modified approach is not needed. Generally,
the modified P-value is less discrete than the ordinary P-value and leads to less
conservative tests. For small n, we can see the advantage of using the modified
approach.


Table 3.3. Cross-classification of job satisfaction with income, controlling for gender,
for black Americans.

                              Satisfaction
Gender    Income        VD    LS    MS    VS

Male      < 5000         1     1     2     1
          < 15000        0     3     5     1
          < 25000        0     0     7     3
          > 25000        0     1     9     6

Female    < 5000         1     3    11     2
          < 15000        2     3    17     3
          < 25000        0     1     8     5
          > 25000        0     2     4     2

Source: General Social Surveys (1991)

VD : Very Dissatisfied, LS : A little Satisfied
MS : Moderately Satisfied, VS : Very Satisfied


3.6.2  Example  2 


We next illustrate the exact tests of independence using Table 3.5, which is a 2 x 3
table from the example in Table 1 of Patefield (1982). These are the results of a
double-blind study concerning the use of Oxprenolol in the treatment of examination
stress. Among 32 students, 15 were treated with Oxprenolol and 17 were given
Diazepam (control). The examination results were compared with their tutor's
prediction. The column classification is ordinal, and the row classification can be
treated as ordinal since it has two levels.


Table 3.4. Estimated exact P-values for testing conditional independence in Table 3.3.

                                   Ordinary    Modified      Modified       Asymptotic
                                   P-value     P-value P*    P-value P*_P   P-value

Assuming no three-factor interaction
  Nominal-by-Nominal               0.332       0.332         0.332          0.335
  Nominal-by-Ordinal               0.024       0.024         0.024          0.026
  Ordinal-by-Ordinal               0.006       0.004         0.005          0.005

Permitting three-factor interaction
  Nominal-by-Nominal               0.281       0.281         0.281          0.277
  Nominal-by-Ordinal               0.089       0.089         0.089          0.089
  Ordinal-by-Ordinal               0.020       0.020         0.020          0.021


When X and Y are ordinal, a relevant model that reflects the ordinality in a
two-way table is the model of linear-by-linear association,

$$\log m_{ij} = \mu + \lambda_i^X + \lambda_j^Y + \beta u_i v_j. \tag{3.22}$$

The independence model is the special case of $\beta = 0$. We test independence against
the model of linear-by-linear association in order to utilize ordinality. For unit-spaced
scores, the likelihood-ratio chi-squared statistic for testing the fit of that model equals
2.64, with df = 1. The estimated association parameter is $\hat{\beta}$ = 1.706 with s.e. =
0.773. The likelihood-ratio chi-squared statistic for testing independence, assuming
the model, is 9.38 - 2.64 = 6.74 with df = 1 (P = 0.009). There seems to be very strong
evidence that the examination grades compared with their tutor's prediction tend
to be higher in the treatment group. Large-sample approximations are questionable
since the sample size is small. We use Monte Carlo sampling with M = 50,000 and
compare the estimated exact P-value with the exact P-value.
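The likelihood-ratio quantities quoted above can be partly reproduced with a few
lines of Python. The sketch below computes the statistic for testing independence in
Table 3.5 (counts 5, 8, 2 for the treated group and 0, 11, 6 for the control group) and
the chi-squared tail area with df = 1; fitting the linear-by-linear model itself is not
attempted, and the function names are hypothetical.

```python
import math


def g2_independence(table):
    """Likelihood-ratio statistic G^2 for independence in a two-way table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / n
                g2 += 2.0 * count * math.log(count / expected)
    return g2


def chi2_upper_tail_df1(x):
    """P(chi-squared with 1 df exceeds x), via the complementary error function."""
    return math.erfc(math.sqrt(x / 2.0))


table_3_5 = [[5, 8, 2], [0, 11, 6]]
print(round(g2_independence(table_3_5), 2))
print(round(chi2_upper_tail_df1(6.74), 3))
```

The tail function relies on the fact that a chi-squared variate with one degree of
freedom is the square of a standard normal variate.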

For the exact tests of independence, the estimated ordinary exact P-values (with
95% precision indicated in parentheses) are 0.026 (± 0.001) for the
nominal-by-nominal test, 0.024 (± 0.001) for the nominal-by-ordinal test, and
0.013 (± 0.001) for the ordinal-by-ordinal test. The corresponding estimated
modified exact P-values P* are 0.026, 0.017, and 0.013. The asymptotic
P-values are 0.028, 0.015, and 0.007, respectively. The ordinary exact P-value for the
ordinal-by-ordinal test is 0.013. For an I x J table with ordinal variables, StatXact
gives ordinary exact P-values, based on methodology in Agresti et al. (1990). Table 3.6
summarizes results for the tests we have discussed. Note that utilizing the ordinality
provides very strong evidence of association. Also, the modified P-value can give
sharper inference for small n.
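For a table as small as Table 3.5, the ordinary exact one-sided P-value for the
ordinal-by-ordinal test can be obtained by direct enumeration, which gives a check on
the Monte Carlo estimates. The Python sketch below assumes unit-spaced scores, a
2 x 3 layout, and the linear-by-linear sum of u_i v_j n_ij as the test statistic; it also
reports a modified value that breaks ties on T with the null table probability. The
function name is hypothetical.

```python
from math import comb


def exact_pvalues_2x3(table, u, v):
    """Enumerate all 2 x 3 tables with the observed margins.

    Returns the ordinary upper-tail P-value and the version in which tables
    tied with the observed T count only if they are no more probable.
    """
    r1 = sum(table[0])
    cols = [table[0][j] + table[1][j] for j in range(3)]

    def stat(first_row):
        return sum(v[j] * (u[0] * first_row[j] + u[1] * (cols[j] - first_row[j]))
                   for j in range(3))

    def weight(first_row):
        # Proportional to the hypergeometric probability of the table.
        return (comb(cols[0], first_row[0]) * comb(cols[1], first_row[1])
                * comb(cols[2], first_row[2]))

    t0, w0 = stat(table[0]), weight(table[0])
    total = ordinary = modified = 0
    for a in range(min(cols[0], r1) + 1):
        for b in range(min(cols[1], r1 - a) + 1):
            c = r1 - a - b
            if c > cols[2]:
                continue
            row = (a, b, c)
            w, t = weight(row), stat(row)
            total += w
            if t >= t0:
                ordinary += w
                if t > t0 or w <= w0:
                    modified += w
    return ordinary / total, modified / total


p_ordinary, p_modified = exact_pvalues_2x3([[5, 8, 2], [0, 11, 6]], [1, 2], [1, 2, 3])
```

Under these assumptions the ordinary value comes out at about 0.013, in line with the
exact P-value quoted for the ordinal-by-ordinal test. All weights are integers, so the
tie comparisons involve no rounding error.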


Table 3.5. Examination results compared with tutor's predictions.

                       Results
Group        Better     Same     Worse

Treated         5         8        2
Control         0        11        6

Source: Patefield (1982)

Better : Better than predicted
Same : Same as predicted
Worse : Worse than predicted


Table 3.6. Estimated exact P-values for testing independence in Table 3.5.

                         Ordinary     Modified       Asymptotic
                         P-value      P-value P*     P-value

Nominal-by-Nominal       0.026        0.026          0.028
Nominal-by-Ordinal       0.024        0.017          0.015
Ordinal-by-Ordinal       0.013        0.013          0.007


3.7  FORTRAN  Program  for  Simulation 


Patefield  (1981)  provided  a subroutine  for  generating  two-way  random  tables  with 
fixed  row  and  column  totals.  We  can  apply  his  algorithm  stratum  by  stratum  in  order 
to  construct  three-way  random  contingency  tables.  We  utilize  the  six  exact  tests  for 
testing conditional independence in I x J x K contingency tables that were discussed
in  Section  2 and  Section  3.  These  test  statistics  are  score  statistics  for  loglinear 
models,  and  they  do  not  require  fitting  the  model.  The  computations,  which  involve 
simulating  exact  conditional  distributions,  are  considerably  simpler  when  one  can  use 
test  statistics  that  do  not  require  fitting  the  model  for  each  table  generated  for  the 
simulations. 

Boyett  (1979)  also  constructed  a subroutine  that  generates  two-way  random  tables 
from  the  exact  distribution  with  given  row  and  column  totals.  Patefield’s  (1981) 
subroutine  is  faster  for  larger  values  of  n,  and  it  can  calculate  the  probability  of  each 
generated  random  table. 

By Monte Carlo sampling of tables in the reference set, we can approximate
exact inference with simulated exact and modified exact P-values for testing
conditional independence. As further random contingency tables are sampled, the
P-value estimate is updated. The FORTRAN program runs interactively. For
computational accuracy, double precision is used. This program is designed for
IBM-compatible PCs or UNIX workstations, and the general structure of the program
and part of the FORTRAN source code are listed in Appendix B.


3.7.1  Restrictions 


Two-way random tables must have at least two rows and two columns, and row
and column totals should be positive. The maximum number of rows and columns is
50, and the maximum number of strata is 20. The number (NROW - 1) x (NCOL - 1)
should be less than 250. This is the maximum array size for the variance-covariance
matrix in the nominal-by-nominal test.

Recursive calculation of the log-factorial through log (n+1)! = log n! + log(n+1) has
the disadvantage of accumulating a large rounding error (Verbeek and Kroonenberg
1985). For accuracy, double precision is used for the log-factorial, and the
log-factorial can be computed for arguments up to 25000.
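The size of the accumulated rounding error is easy to examine. The Python sketch
below builds the log-factorial table by the recursion in double precision and compares
it with the standard-library log-gamma function, which is used here as the reference
value; it is not the FORTRAN routine of Appendix B, and the function names are
hypothetical.

```python
import math


def log_factorial_table(n_max):
    """log(k!) for k = 0, ..., n_max via log (k+1)! = log k! + log(k+1)."""
    table = [0.0] * (n_max + 1)
    for k in range(1, n_max + 1):
        table[k] = table[k - 1] + math.log(k)
    return table


def max_recursion_error(n_max):
    """Largest absolute gap between the recursion and math.lgamma(k + 1)."""
    table = log_factorial_table(n_max)
    return max(abs(table[k] - math.lgamma(k + 1)) for k in range(n_max + 1))


print(max_recursion_error(25000))
```

In single precision the same recursion would lose several digits by the time the
argument reaches the thousands, which is why the double-precision table matters for
the table probabilities used in the tests.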


CHAPTER  4 

IMPROVED EXACT TESTS FOR ORDINAL VARIABLES IN I x J x K TABLES


4.1  Introduction 


Consider  contingency  tables  under  the  full  multinomial  model  where  row  and 
column  classifications  are  ordinal.  In  two-way  contingency  tables  when  both  classi- 
fications are  ordinal,  the  null  hypothesis  of  independence  can  be  tested  against  the 
alternative  that  utilizes  local  log  odds  ratios.  Many  tests  for  measuring  ordinal  as- 
sociation have  been  proposed.  We  can  utilize  tests  based  on  C — D,  the  number 
of  concordant  pairs  minus  the  number  of  discordant  pairs,  or  based  on  the  gamma 
statistic.  Both  are  discussed  in  Agresti  (1990).  Also,  log-linear  models  with  ordered 
categories  are  discussed.  Agresti,  Mehta,  and  Patel  (1990)  provide  an  algorithm  that 
permits  exact  tests  for  the  linear-by-linear  association  model  for  two-way  contingency 
tables  with  ordered  categories. 

If an exact test is desired with size equal to some preassigned value, then
randomization would be required on some tables of observed frequencies. This is
typical of any discrete problem. We want the resulting test to be admissible even
though randomization occurs. Cohen and Sackrowitz (1991) proved a theorem that
gives the class of exact, unbiased, and admissible tests. Also, Cohen and Sackrowitz
(1992) suggested a procedure for an exact test of size α, and a modified P-value. Such
tests are performed conditionally, given the values of the sufficient statistics for the
nuisance parameters under the null hypothesis. Hence, the critical value depends on
the values of the sufficient statistics.

They constructed the exact test of size α by ordering the tables according to
their probabilities on sample points where the test would randomize. They made
the number of tables on which randomization would occur considerably smaller than
in the usual test. We could use another test statistic directed toward a broader
alternative hypothesis at the randomization points, utilizing the modified approach
discussed in Chapter 2.

Cohen  and  Sackrowitz  (C-S)  focused  on  two-way  tables,  and  showed  unbiasedness 
of  tests  in  two-way  tables.  Eaton  (1970)  showed  the  essentially  complete  class  in 
an  exponential  family.  Eaton’s  theorem  shows  that  the  essentially  complete  class 
consists  of  tests  whose  acceptance  regions  are  convex  with  possible  randomization  on 
the  boundary  of  acceptance  region.  Furthermore,  Ledwina  (1978a,  1984)  gave  the 
class  of  admissible  rules  in  an  exponential  family.  Admissibility  of  tests  for  the  C-S 
theorem  is  obtained  using  the  arguments  in  Ledwina. 

We  focus  on  analyzing  three-way  tables.  The  problem  we  will  consider  is  testing 
conditional  independence,  assuming  that  the  model  of  no  three-factor  interaction 
holds.  We  first  introduce  theorems  and  lemmas  from  C-S  (1991),  and  then  generalize 
these  to  three-way  contingency  tables.  In  Section  2 we  state  the  theorem  of  C-S 
(1991)  as  well  as  related  lemmas,  that  give  the  class  of  unbiased  admissible  tests.  In 
Section  3 we  show  unbiasedness  of  tests  when  one  wishes  to  test  a null  hypothesis  of 
conditional  independence  against  the  alternative  of  no  three-factor  interaction  model 
in  three-way  contingency  tables.  Sections  4 and  5 present  the  complete  class  of  tests 
and  admissible  tests  in  an  exponential  family.  Using  these  arguments,  the  tests  of 
the  C-S  theorem  lie  in  a complete  and  admissible  class  when  we  consider  three-way 
tables  under  the  multinomial  model. 

Section 6 generalizes to the three-way case some results of Cohen and Sackrowitz
(1991, 1992) regarding admissibility of tests for two-way tables. For an ordinal
alternative, we discuss construction of tests of conditional independence that are exact,
unbiased, and admissible. As a special case, we note that the ordinary randomized
test of conditional independence for 2 x 2 x K tables is usually inadmissible.

Section  7 illustrates  the  exact,  unbiased  and  admissible  tests  with  examples.  We 
test  conditional  independence  in  2 x 2 x 5 tables.  Section  8 gives  some  comments. 

4.2  Basic  Results  in  Two-way  Contingency  Table 


Consider  testing  independence  against  the  alternative  that  all  local  log  odds  ratios 
are  nonnegative  with  at  least  one  local  log  odds  ratio  positive  for  a two-way  table.  We 
will  state  the  class  of  tests  that  are  simultaneously  exact,  unbiased,  and  admissible  in 
this  section.  We  need  definitions  and  lemmas  for  the  proof  of  unbiasedness  of  tests, 
obtained  by  Cohen  and  Sackrowitz  (1991).  We  will  extend  their  theorem  and  lemmas 
to  three-way  tables  in  the  next  section. 

Consider an I x J contingency table under the full multinomial model where each
classification is ordinal. Let $N = \{n_{ij}\}$ be the I x J two-way contingency table of cell
frequencies, and let $\pi = \{\pi_{ij}\}$ be the I x J matrix of corresponding cell probabilities,
where $n = \sum_i \sum_j n_{ij}$ and $\sum_i \sum_j \pi_{ij} = 1$. Let $n_{i+}$ be the ith row total of cell
frequencies, $i = 1, \ldots, I$, and $n_{+j}$ the jth column total of cell frequencies,
$j = 1, \ldots, J$, and let $m = (\{n_{i+}\}, \{n_{+j}\})$. We define the local log odds ratios as

$$\psi_{ij} = \log \frac{\pi_{ij}\,\pi_{i+1,j+1}}{\pi_{i,j+1}\,\pi_{i+1,j}}, \quad i = 1, \ldots, I-1, \ j = 1, \ldots, J-1.$$

Our testing problem can be expressed as testing the null hypothesis $H_0: \psi_{ij} = 0$
for $i = 1, \ldots, I-1$, $j = 1, \ldots, J-1$. From Ledwina (1984), under the full
multinomial model, the distribution of an observed random vector, N, of the cell
frequencies $(n_{11}, \ldots, n_{IJ})$ can be written in the form

$$f(N) = d^{\,n}\, n! \prod_{i,j} (n_{ij}!)^{-1} \exp\left( \sum_{i=1}^{I-1} \sum_{j=1}^{J-1} n_{ij} a_{ij} + \sum_{i=1}^{I-1} n_{i+} b_i + \sum_{j=1}^{J-1} n_{+j} d_j \right), \tag{4.1}$$

where

$$a_{ij} = \log \frac{\pi_{ij}\,\pi_{IJ}}{\pi_{iJ}\,\pi_{Ij}}, \quad b_i = \log \frac{\pi_{iJ}}{\pi_{IJ}}, \quad d_j = \log \frac{\pi_{Ij}}{\pi_{IJ}},$$

and

$$d = \left( 1 + \sum_{i=1}^{I-1} e^{b_i} + \sum_{j=1}^{J-1} e^{d_j} + \sum_{i=1}^{I-1} \sum_{j=1}^{J-1} e^{a_{ij} + b_i + d_j} \right)^{-1}.$$

Note that (4.1) is the density of a multivariate exponential family, and
$a_{ij} = \sum_{k=i}^{I-1} \sum_{l=j}^{J-1} \psi_{kl}$. Then our hypotheses become $H_0: \psi_{ij} = 0$,
$i = 1, \ldots, I-1$, $j = 1, \ldots, J-1$, and $H_a: \psi_{ij} \ge 0$, with strict inequality for at
least one pair $(i, j)$.

Also let $T_{ij} = \sum_{k=1}^{i} \sum_{l=1}^{j} n_{kl}$, $T_i = (T_{i1}, \ldots, T_{i,J-1})$, $i = 1, \ldots, I-1$, and
$T = (T_1, \ldots, T_{I-1})$. Attention can be restricted to the sufficient statistics (T, m),
which have the joint distribution

$$f(t, m) = \beta(\psi, b, d) \exp\left( \sum_{i=1}^{I-1} \sum_{j=1}^{J-1} \psi_{ij} t_{ij} + \sum_{i=1}^{I-1} n_{i+} b_i + \sum_{j=1}^{J-1} n_{+j} d_j \right) g(t, m). \tag{4.2}$$

Note that (T, m) is a one-to-one linear transformation from the space N. Let
us next consider the structure of an exact test. If one wishes an exact test such
that the size is equal to the nominal value, any test procedure would require possible
randomization on some points of the distribution of the test statistic. For an observed
table N, a test chooses rejection or acceptance with certain probabilities that depend
on N, denoted by $\varphi(N)$ and $1 - \varphi(N)$, respectively. A randomized test is therefore
completely characterized by $\varphi$, the critical function, with $0 \le \varphi(N) \le 1$ for all N.
If $\varphi(N)$ takes on only the values 1 and 0, then this becomes a nonrandomized test.
Let $\varphi(N)$ denote an exact test of size α depending on T and m for the hypotheses
concerning the distribution of N (or the joint distribution of T and m), and also
denote the conditional test as a function of t, for each fixed m, by $\varphi_m(t)$. If the
conditional test has conditional size α, then the size of the original test can be obtained
from the conditional tests by taking the expectation over m, which is
$E_{\psi=0,\nu}[\varphi(N)] = E_{\nu}\{E_{\psi=0}[\varphi_m(T) \mid m]\}$, where $\nu$ refers to the nuisance
parameters for m.
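The passage from the cell counts to the statistics T can be checked numerically.
The Python sketch below assumes that T_ij is the cumulative count over the upper-left
block of cells (rows 1 to i, columns 1 to j) and that a_ij is the log odds ratio formed
with the last row and last column; under those assumptions the sum of n_ij a_ij equals
the sum of psi_ij T_ij. The function names are hypothetical.

```python
import math


def local_log_odds(pi):
    """psi_ij = log(pi_ij * pi_{i+1,j+1} / (pi_{i,j+1} * pi_{i+1,j}))."""
    rows, cols = len(pi), len(pi[0])
    return [[math.log(pi[i][j] * pi[i + 1][j + 1] / (pi[i][j + 1] * pi[i + 1][j]))
             for j in range(cols - 1)] for i in range(rows - 1)]


def corner_log_odds(pi):
    """a_ij = log(pi_ij * pi_IJ / (pi_iJ * pi_Ij)), using the last row and column."""
    rows, cols = len(pi), len(pi[0])
    return [[math.log(pi[i][j] * pi[-1][-1] / (pi[i][-1] * pi[-1][j]))
             for j in range(cols - 1)] for i in range(rows - 1)]


def cumulative_counts(n):
    """T_ij = sum of n_kl over k <= i and l <= j, for i < I and j < J."""
    rows, cols = len(n), len(n[0])
    return [[sum(n[k][l] for k in range(i + 1) for l in range(j + 1))
             for j in range(cols - 1)] for i in range(rows - 1)]


def inner(a, b):
    """Sum of elementwise products of two equally shaped nested lists."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


pi = [[0.20, 0.10, 0.05], [0.10, 0.15, 0.10], [0.05, 0.10, 0.15]]
n = [[4, 2, 1], [3, 5, 2], [1, 2, 6]]
lhs = inner([row[:-1] for row in n[:-1]], corner_log_odds(pi))
rhs = inner(cumulative_counts(n), local_log_odds(pi))
```

The two sums agree up to rounding, which is the identity that turns the exponent of
the multinomial density into a linear form in the local log odds ratios.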

By Lehmann (1986), m is sufficient and complete under the null, and any similar
test of size α must have Neyman structure. Hence, the test for each fixed m, $\varphi_m(t)$,
must have conditional size α, i.e., $E_{\psi=0}[\varphi_m(t) \mid m] = \alpha$ for all m. Accordingly, if
$\varphi(N)$ is of size α, then $\varphi_m(t)$ is of size α for each fixed m. The conditional
distribution of T can be obtained by fixing m, which is free from nuisance parameters,
and the test $\varphi_m(t)$ is done conditionally given the values of the sufficient statistics,
m, for the nuisance parameters under the null. Hence, the critical values depend on
these values.

We want to establish conditions under which the overall test is unbiased and
admissible. Suppose for each m, $\varphi_m(t)$ is monotone nondecreasing in t. This means
that when all elements of t are fixed except for any one, $\varphi_m(t)$ is nondecreasing in
that variable. Next, we let for each fixed m, $A_m = \{t : \varphi_m(t) < 1\}$. Hence, $A_m$ is
the acceptance region of the test, except for possible randomization. A
point $a \in A$ is called an extreme point if a is not an interior point of any line segment
in A. Cohen and Sackrowitz (1991) gave the class of tests that are simultaneously
exact, unbiased, and admissible.

Theorem 4.2.1 For each fixed m, if $\varphi_m(t)$ is monotone nondecreasing in t, then the
test $\varphi_m(t)$ is conditionally unbiased and the original test $\varphi(N)$ is unconditionally
unbiased. Furthermore, the test $\varphi(N)$ is admissible if and only if for each fixed m,
$A_m$ is convex and $\varphi_m(t)$ is zero at nonextreme points of $A_m$.

Hence, an exact test is unbiased and admissible if and only if conditionally, given
m, the acceptance regions are monotone (in the sense that the corresponding $\varphi_m$
is monotone) and convex with randomization possible only at extreme points.

The following definitions and lemmas are used for the proof of the unbiasedness
of tests in Theorem 4.2.1. Let x be a k x 1 vector lying in $X^k = X_1 \times X_2 \times \cdots \times X_k$,
where $X_i$ is a totally ordered subset of $\mathbb{R}$. Let $H_k$ denote the family of nondecreasing
functions on $X^k$. Let $x, y \in X^k$ and f(x) be a nonnegative function defined on $X^k$
satisfying

$$f(x \vee y)\, f(x \wedge y) \ge f(x)\, f(y), \tag{4.3}$$

where $\vee$ and $\wedge$ are the corresponding lattice operations on $X^k$, i.e., for
$x = (x_1, \ldots, x_k)$, $y = (y_1, \ldots, y_k)$,

$$x \vee y = (\max(x_1, y_1), \max(x_2, y_2), \ldots, \max(x_k, y_k))$$

and

$$x \wedge y = (\min(x_1, y_1), \min(x_2, y_2), \ldots, \min(x_k, y_k)).$$

From Karlin and Rinott (1980) we have the following definition.

Definition 4.2.1 A function with the property (4.3) is said to be multivariate totally
positive of order 2 (MTP2) on $X^k$. Also a k x 1 random vector $U = (U_1, \ldots, U_k)$ is
MTP2 if its density is MTP2.
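On a finite grid, the MTP2 inequality can be verified by brute force. The Python
sketch below checks the product inequality for every pair of grid points, using
componentwise maxima and minima as the lattice operations; the function name and
the small tolerance for floating-point comparison are assumptions made for the
illustration.

```python
from itertools import product


def is_mtp2(f, grids, tol=1e-12):
    """Check f(x v y) * f(x ^ y) >= f(x) * f(y) for all pairs on a finite grid."""
    points = list(product(*grids))
    for x in points:
        for y in points:
            join = tuple(max(a, b) for a, b in zip(x, y))  # componentwise maximum
            meet = tuple(min(a, b) for a, b in zip(x, y))  # componentwise minimum
            if f(join) * f(meet) < f(x) * f(y) - tol:
                return False
    return True
```

For instance, a function of the form exp(theta * x1 * x2) on a grid of nonnegative
integers passes the check when theta is positive and fails it when theta is negative,
which mirrors the role of nonnegative log odds ratios in the ordinal alternative.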

The multivariate total positivity is defined in terms of ordering on a lattice. Karlin
and Rinott showed that if f(x) and g(x) are MTP2 on $X^k$, then f(x)g(x) is MTP2
on $X^k$. Also, if $f(x) = g(x_i, x_j)$, where g is TP2 on $X_i \times X_j$, then f is MTP2 on
$X^k$. Hence, products of such functions are MTP2 on $X^k$. In connection with MTP2
densities, Fortuin, Ginibre, and Kasteleyn (1971) stated the following inequality, which
we denote by FGK. Let U be a random vector whose density is MTP2 with respect
to a product measure defined on a product set. Let $W_1$, $W_2$ be functions of U lying
in $H_k$. Then

$$E(W_1 W_2) \ge E W_1 \, E W_2. \tag{4.4}$$

Now let $u \le v$ mean $u_i \le v_i$, $i = 1, \ldots, k$. From Marshall and Olkin (1970) we
have the following definition.

Definition 4.2.2 A random vector U is said to be stochastically less than or equal to
a random vector V if

$$E h(U) \le E h(V), \tag{4.5}$$

for all $h \in H_k$ for which expectations exist.

These definitions and inequality were incorporated in the following lemmas. These
lemmas were provided by Cohen and Sackrowitz (1991), and applied to show the
unbiasedness of tests.

Lemma 4.2.1 Assume $H_0$ is true. Also assume, conditional on $m = (\{n_{i+}\}, \{n_{+j}\})$,
$i = 1, \ldots, I-1$, $j = 1, \ldots, J-1$,

$$T_1 \ \text{is MTP2}, \tag{4.6}$$

$$T_i \mid T_1, \ldots, T_{i-1} \ \text{is MTP2 for all} \ i = 2, 3, \ldots, I-1, \tag{4.7}$$

$$T_i \mid T_1, \ldots, T_{i-1} \ \text{is stochastically less than or equal to} \ T_i \mid T_1', \ldots, T_{i-1}' \ \text{if} \ T_j \le T_j' \ \text{for} \ j = 1, \ldots, i-1. \tag{4.8}$$

Then, for nondecreasing functions W and W* of T,

$$E\{W(T) W^*(T) \mid m\} \ge E\{W(T) \mid m\}\, E\{W^*(T) \mid m\}. \tag{4.9}$$

The next lemmas establish the three conditions assumed in the previous lemma, and
the proofs are given by Cohen and Sackrowitz (1991).

Lemma 4.2.2 Under $H_0$, $T_1$ given m is MTP2.

Lemma 4.2.3 Under $H_0$, $T_i \mid T_1, \ldots, T_{i-1}, m$ is MTP2 for all $i = 2, \ldots, I-1$.

Lemma 4.2.4 Under $H_0$, $T_i \mid T_1, \ldots, T_{i-1}, m$ is stochastically less than or equal to
$T_i \mid T_1', \ldots, T_{i-1}', m$ for all $i = 2, 3, \ldots, I-1$ if $T_j \le T_j'$ for $j = 1, \ldots, i-1$.

Cohen and Sackrowitz proved the unbiasedness portion of Theorem 4.2.1 for two-way
tables. These lemmas and the FGK inequality are the main tools for the proof. Now, we
display a test statistic for two-way tables and show that tests based on it
have a desirable monotonicity property.

For a two-way contingency table where each classification is ordered, the statistic
to reflect the association between two ordinal variables is

$$T = \sum_{i=1}^{I}\sum_{j=1}^{J} u_i v_j n_{ij}, \tag{4.10}$$

where the $u_i$'s and $v_j$'s are monotone scores to display category ordering. This statistic
is studied in Agresti (1990, 1992). If $u_i = I - (i - 1)$ and $v_j = J - (j - 1)$, then
it becomes

$$T = \sum_{i=1}^{I-1}\sum_{j=1}^{J-1} T_{ij}, \tag{4.11}$$

where $T_{ij} = \sum_{a=1}^{i}\sum_{b=1}^{j} n_{ab}$. Next, we show that tests based on $T = \sum\sum T_{ij}$ have a
desirable monotonicity property; hence, they are unbiased. We note that by the
definition of monotonicity (all $T_{ij}$ are fixed, except one), for any $I$, $J$ the statistic
$T = \sum\sum T_{ij}$ is monotone in $T_{ij}$, $i = 1, \dots, I-1$, $j = 1, \dots, J-1$. Hence, the test
based on $T$ is monotone and therefore unbiased, since it satisfies the condition of Theorem
4.2.1. This is the test statistic that Cohen and Sackrowitz used in two-way tables
with an ordinal alternative.

For the admissibility portion, Ledwina's theorem will be applied and stated in
Section 4.5. The unbiased and admissible tests will be explained in Section 4.6, and
we will show that the tests satisfy the properties of Theorem 4.2.1.


4.3 Unbiasedness of Tests in Three-way Contingency Tables


In this section, we will generalize Theorem 4.2.1 for testing conditional independence
in three-way contingency tables. The unbiasedness portion of Theorem 4.2.1
for three-way tables is proved with lemmas, and we utilize
the definitions and lemmas stated in Section 4.2. For the admissibility part, we will
apply the theorems in Ledwina (1978a, 1984) and Matthes and Truax (1967) for exponential
families, which will be stated in the next sections. Showing unbiasedness
of tests is the main part of proving Theorem 4.2.1 in three-way tables, and we will
follow the arguments in Ledwina for the admissibility of the tests. Then we have the
exact, unbiased, and admissible tests in three-way tables.


4.3.1  Conditional  Independence  Model 


We  will  specify  the  general  multinomial  model  in  three-way  tables,  and  state  the 
testing  problem  under  the  null  hypothesis  of  conditional  independence.  We  will  prove 
unbiasedness  of  tests  and  related  lemmas  focusing  on  three-way  tables.  Consider  an 
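$I \times J \times K$ contingency table under the multinomial model, where each row and column
classification is ordinal. Let $N = \{n_{ijk}\}$ denote observed cell counts, with expected
frequencies $\{m_{ijk}\}$. Let $\pi = \{\pi_{ijk}\}$ be probabilities for a multinomial distribution
over the $I \times J \times K$ cells, where $n = \sum_i\sum_j\sum_k n_{ijk}$ and $\sum_i\sum_j\sum_k \pi_{ijk} = 1$. From Ledwina (1978b),
the distribution of $N$ can be written as

$$f(N) = \frac{n!}{\prod_{i,j,k} n_{ijk}!}\; l^{-n} \exp\Big( \sum_{i=1}^{I-1}\sum_{j=1}^{J-1}\sum_{k=1}^{K-1} a_{ijk} n_{ijk}
+ \sum_{i=1}^{I-1}\sum_{j=1}^{J-1} b_{ij} n_{ij+}
+ \sum_{i=1}^{I-1}\sum_{k=1}^{K-1} c_{ik} n_{i+k}
+ \sum_{j=1}^{J-1}\sum_{k=1}^{K-1} d_{jk} n_{+jk}
+ \sum_{i=1}^{I-1} e_i n_{i++} + \sum_{j=1}^{J-1} f_j n_{+j+} + \sum_{k=1}^{K-1} g_k n_{++k} \Big), \tag{4.12}$$

where

$$a_{ijk} = \log\frac{\pi_{ijk}\pi_{IJk}}{\pi_{iJk}\pi_{Ijk}} - \log\frac{\pi_{ijK}\pi_{IJK}}{\pi_{iJK}\pi_{IjK}}, \qquad
b_{ij} = \log\frac{\pi_{ijK}\pi_{IJK}}{\pi_{iJK}\pi_{IjK}},$$

$$c_{ik} = \log\frac{\pi_{iJk}\pi_{IJK}}{\pi_{iJK}\pi_{IJk}}, \qquad
d_{jk} = \log\frac{\pi_{Ijk}\pi_{IJK}}{\pi_{IjK}\pi_{IJk}},$$

$$e_i = \log\frac{\pi_{iJK}}{\pi_{IJK}}, \qquad f_j = \log\frac{\pi_{IjK}}{\pi_{IJK}}, \qquad g_k = \log\frac{\pi_{IJk}}{\pi_{IJK}},$$

and

$$l = 1 + \sum_i e^{e_i} + \sum_j e^{f_j} + \sum_k e^{g_k} + \cdots + \sum_i\sum_j\sum_k e^{\,e_i + f_j + g_k + b_{ij} + c_{ik} + d_{jk} + a_{ijk}}.$$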

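Let

$$\psi_{ij(k)} = \log\frac{\pi_{ijk}\,\pi_{i+1,j+1,k}}{\pi_{i,j+1,k}\,\pi_{i+1,j,k}},$$

which is the local log odds ratio in the $k$th stratum. Note that

$$\log\frac{\pi_{lmk}\pi_{IJk}}{\pi_{lJk}\pi_{Imk}} = \sum_{i=l}^{I-1}\sum_{j=m}^{J-1} \psi_{ij(k)}.$$

Hence, we have

$$a_{lmk} = \sum_{i=l}^{I-1}\sum_{j=m}^{J-1} \big(\psi_{ij(k)} - \psi_{ij(K)}\big).$$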
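
The telescoping identity used above (a corner log odds ratio equals the sum of the local log odds ratios between that cell and the last row and column) can be checked numerically within a single stratum. The sketch below is only an illustration: the table dimensions, the random positive weights, and the function names are assumptions, and indices are 0-based.

```python
import math
import random

random.seed(0)
I, J = 4, 3
# Hypothetical positive cell weights for one stratum; normalization does not affect odds ratios.
pi = [[random.uniform(0.5, 2.0) for _ in range(J)] for _ in range(I)]

def local_log_or(i, j):
    """Local log odds ratio for rows i, i+1 and columns j, j+1 (0-based)."""
    return math.log(pi[i][j] * pi[i + 1][j + 1] / (pi[i][j + 1] * pi[i + 1][j]))

def corner_log_or(l, m):
    """Log odds ratio of cell (l, m) against the last row and last column."""
    return math.log(pi[l][m] * pi[I - 1][J - 1] / (pi[l][J - 1] * pi[I - 1][m]))

def max_gap():
    """Largest absolute difference between the two sides of the identity."""
    gaps = []
    for l in range(I - 1):
        for m in range(J - 1):
            total = sum(local_log_or(i, j)
                        for i in range(l, I - 1) for j in range(m, J - 1))
            gaps.append(abs(total - corner_log_or(l, m)))
    return max(gaps)

print(max_gap())  # numerically zero up to floating-point error
```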


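Also let $T_{ij(k)} = \sum_{l=1}^{i}\sum_{m=1}^{j} n_{lmk}$, and $T_i^{(k)} = (T_{i1(k)}, T_{i2(k)}, \dots, T_{i(J-1)(k)})$, $i = 1, \dots, I-1$,
and $T = (T_1^{(1)}, \dots, T_{I-1}^{(1)}, T_1^{(2)}, \dots, T_{I-1}^{(2)}, \dots, T_1^{(K)}, \dots, T_{I-1}^{(K)})$. Then

$$\begin{aligned}
\sum_{k=1}^{K-1}\Big[\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \big(\psi_{ij(k)} - \psi_{ij(K)}\big) T_{ij(k)}\Big]
&= \sum_{k=1}^{K-1}\Big[\sum_{i=1}^{I-1}\sum_{j=1}^{J-1}\sum_{l=1}^{i}\sum_{m=1}^{j} \big(\psi_{ij(k)} - \psi_{ij(K)}\big) n_{lmk}\Big] \\
&= \sum_{k=1}^{K-1}\Big[\sum_{l=1}^{I-1}\sum_{m=1}^{J-1}\Big(\sum_{i=l}^{I-1}\sum_{j=m}^{J-1} \big(\psi_{ij(k)} - \psi_{ij(K)}\big)\Big) n_{lmk}\Big] \\
&= \sum_{k=1}^{K-1}\Big[\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} a_{ijk} n_{ijk}\Big].
\end{aligned} \tag{4.13}$$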


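Let $r = (\{n_{ij+}\}, \{n_{i+k}\}, \{n_{+jk}\})$. Then using (4.13) we rewrite (4.12) as

$$f(N) = \beta(\psi, b, c, d, e, f, g) \exp\Big( \sum_{k=1}^{K-1}\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \big(\psi_{ij(k)} - \psi_{ij(K)}\big) T_{ij(k)}
+ \sum_{i=1}^{I-1}\sum_{j=1}^{J-1} n_{ij+} b_{ij}
+ \sum_{i=1}^{I-1}\sum_{k=1}^{K-1} n_{i+k} c_{ik}
+ \sum_{j=1}^{J-1}\sum_{k=1}^{K-1} n_{+jk} d_{jk}
+ \sum_{i=1}^{I-1} n_{i++} e_i + \sum_{j=1}^{J-1} n_{+j+} f_j + \sum_{k=1}^{K-1} n_{++k} g_k \Big) \cdot g(t, r). \tag{4.14}$$

Hence, no three-factor interaction has the following equivalent expressions, for all $i$
and $j$:

$$a_{ijk} = 0, \quad k = 1, \dots, K-1$$

$$\Longleftrightarrow \quad \psi_{ij(k)} = \psi_{ij(K)}, \quad k = 1, \dots, K-1$$

$$\Longleftrightarrow \quad \psi_{ij(1)} = \psi_{ij(2)} = \cdots = \psi_{ij(K)} = \psi_{ij}.$$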

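It means that the association between row and column variables is identical at each
level of the stratum variable.

When we test the model of conditional independence, we will assume that the
model of no three-factor interaction holds. Hence, we assume that for all $i$ and
$j$, $\psi_{ij(k)} - \psi_{ij(K)} = 0$, $k = 1, \dots, K-1$. Let $T_{ij+} = \sum_{l=1}^{i}\sum_{m=1}^{j} n_{lm+}$. Then

$$\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} b_{ij} n_{ij+} = \sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \psi_{ij(K)} T_{ij+}. \tag{4.15}$$

Let $m = (\{n_{i+k}\}, \{n_{+jk}\})$, $i = 1, \dots, I-1$, $j = 1, \dots, J-1$, $k = 1, \dots, K$. Using
(4.15), we rewrite (4.14) as

$$f(N) = \beta(\psi, c, d, e, f, g) \exp\Big( \sum_{k=1}^{K-1}\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \big(\psi_{ij(k)} - \psi_{ij(K)}\big) T_{ij(k)}
+ \sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \psi_{ij(K)} T_{ij+}
+ \sum_{i=1}^{I-1}\sum_{k=1}^{K-1} n_{i+k} c_{ik}
+ \sum_{j=1}^{J-1}\sum_{k=1}^{K-1} n_{+jk} d_{jk}
+ \sum_{i=1}^{I-1} n_{i++} e_i + \sum_{j=1}^{J-1} n_{+j+} f_j + \sum_{k=1}^{K-1} n_{++k} g_k \Big) \cdot g(t, m). \tag{4.16}$$

Note that $T_{ij+} = \sum_{k=1}^{K} T_{ij(k)}$. Then

$$\begin{aligned}
&\sum_{k=1}^{K-1}\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \big(\psi_{ij(k)} - \psi_{ij(K)}\big) T_{ij(k)} + \sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \psi_{ij(K)} T_{ij+} \\
&\quad= \sum_{k=1}^{K-1}\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \big(\psi_{ij(k)} - \psi_{ij(K)}\big) T_{ij(k)} + \sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \psi_{ij(K)} \sum_{k=1}^{K} T_{ij(k)} \\
&\quad= \sum_{k=1}^{K-1}\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \psi_{ij(k)} T_{ij(k)} + \sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \psi_{ij(K)} T_{ij(K)} \\
&\quad= \sum_{k=1}^{K}\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \psi_{ij(k)} T_{ij(k)}.
\end{aligned} \tag{4.17}$$

Hence, using (4.17), we rewrite (4.16) as

$$f(N) = \beta(\psi, c, d, e, f, g) \exp\Big( \sum_{k=1}^{K}\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \psi_{ij(k)} T_{ij(k)}
+ \sum_{i=1}^{I-1}\sum_{k=1}^{K-1} n_{i+k} c_{ik}
+ \sum_{j=1}^{J-1}\sum_{k=1}^{K-1} n_{+jk} d_{jk}
+ \sum_{i=1}^{I-1} n_{i++} e_i + \sum_{j=1}^{J-1} n_{+j+} f_j + \sum_{k=1}^{K-1} n_{++k} g_k \Big) \cdot g(t, m). \tag{4.18}$$

From (4.18), we see that we may treat the observation as $\{T_{ij(k)}\}$ and the parameters
as $\{\psi_{ij(k)}\}$. The problem is to test conditional independence under the assumption
that the model of no three-factor interaction holds. Here we consider the problem
under the simplifying assumption that the $\psi_{ij(k)}$ have a common value $\psi$ over $k$, so that
the hypothesis reduces to $H_0 : \psi = 0$, where the $\psi$'s are not assumed to be equal to one another but
$\psi_{ij(k)} = \psi_{ij}$, $k = 1, \dots, K$, for $i = 1, \dots, I-1$, $j = 1, \dots, J-1$. Therefore, our
hypotheses become

$$H_0 : \psi_{ij(k)} = \psi_{ij(K)} \text{ and } \psi_{ij(K)} = 0, \quad i = 1, \dots, I-1,\ j = 1, \dots, J-1,\ k = 1, \dots, K-1$$

$$\Longleftrightarrow \quad \psi_{ij(k)} = 0, \quad i = 1, \dots, I-1,\ j = 1, \dots, J-1,\ k = 1, \dots, K,$$

$$H_a : \text{No Three-Factor Interaction Model.}$$

The test is carried out conditionally, given the values of the margins, and the conditional
joint distribution of $N$ given $m$ under the null reduces to the product of $K$
hypergeometric mass functions, which is the table probability under the null.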
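
For the 2 x 2 x K case, the null table probability described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each stratum is summarized by a hypothetical triple (first-row total, first-column total, stratum total), the margins shown are invented, and the function names are not from the dissertation.

```python
from itertools import product
from math import comb, isclose

def support(row1, col1, n):
    """Possible values of n11 in a 2x2 table with the given fixed margins."""
    return range(max(0, row1 + col1 - n), min(row1, col1) + 1)

def hypergeom_pmf(x, row1, col1, n):
    """P(n11 = x) given first-row total row1, first-column total col1, and total n."""
    return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

def table_probability(n11, margins):
    """Null probability of a 2x2xK table: product of K hypergeometric mass functions."""
    p = 1.0
    for x, (row1, col1, n) in zip(n11, margins):
        p *= hypergeom_pmf(x, row1, col1, n)
    return p

# Hypothetical margins (row1, col1, n) for K = 3 strata.
margins = [(3, 3, 6), (4, 5, 10), (2, 6, 8)]
tables = list(product(*(support(*mk) for mk in margins)))
total = sum(table_probability(t, margins) for t in tables)
print(len(tables), total)
```

Summing the table probabilities over the whole conditional reference set should give 1, which is a quick sanity check on the enumeration.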


4.3.2  Unbiasedness  of  Tests 


In order to prove unbiasedness of tests in Theorem 4.2.1, we need the following
lemma. In the lemma, three conditions are assumed and they will be verified after
proving unbiasedness in Theorem 4.2.1. This test is done by conditioning on the
values of all elements in the margins, $m = (\{n_{i+k}\}, \{n_{+jk}\})$, that are random, so that
in the conditional model these margins $m$ are fixed, and cell counts from different
strata are independent.

Lemma 4.3.1 Assume $H_0$ is true. Also assume, conditional on $m$,

i) $T_1^{(k)}$ is MTP2 for all $k = 1, \dots, K$; (4.19)

ii) $T_i^{(k)} \mid T_1^{(1)}, \dots, T_{i-1}^{(1)}, T_1^{(2)}, \dots, T_{i-1}^{(2)}, \dots, T_1^{(K)}, \dots, T_{i-1}^{(K)}$ is MTP2
for all $i = 2, 3, \dots, I-1$, $k = 1, \dots, K$; (4.20)

iii) $T_i^{(k)} \mid T_1^{(1)}, \dots, T_{i-1}^{(1)}, \dots, T_1^{(K)}, \dots, T_{i-1}^{(K)} \ \le_p\ T_i^{(k)} \mid T_1'^{(1)}, \dots, T_{i-1}'^{(1)}, \dots, T_1'^{(K)}, \dots, T_{i-1}'^{(K)}$
for all $i = 2, 3, \dots, I-1$, $k = 1, \dots, K$,
if $T_j^{(k)} \le T_j'^{(k)}$ for $j = 1, \dots, i-1$, $k = 1, \dots, K$. (4.21)

Let $W(T) = W(T_1^{(1)}, \dots, T_{I-1}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-1}^{(K)})$ and
$W^*(T) = W^*(T_1^{(1)}, \dots, T_{I-1}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-1}^{(K)})$ belong to $H_{K(I-1)(J-1)}$, where
$H_{K(I-1)(J-1)}$ denotes the family of nondecreasing functions on $X^{K(I-1)(J-1)}$.
Then under $H_0$,

$$E\{W(T)W^*(T) \mid m\} \ge E\{W(T) \mid m\}\, E\{W^*(T) \mid m\}. \tag{4.22}$$

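Proof. We suppress $m$, since all statements are conditional on $m$. Now

$$\begin{aligned}
E\,W(T)W^*(T) &= E\,W(T_1^{(1)}, \dots, T_{I-1}^{(K)})\, W^*(T_1^{(1)}, \dots, T_{I-1}^{(K)}) \\
&= E\{E[\,W(T)W^*(T) \mid T_1^{(1)}, \dots, T_{I-2}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-2}^{(K)}]\} \\
&\ge E\{E(W(T) \mid T_1^{(1)}, \dots, T_{I-2}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-2}^{(K)})\; E(W^*(T) \mid T_1^{(1)}, \dots, T_{I-2}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-2}^{(K)})\}
\end{aligned} \tag{4.23}$$

by (4.20) and the FGK (Fortuin, Ginibre, and Kasteleyn) inequality. The expression
(4.23) can be written as

$$E\,W_1(T_1^{(1)}, \dots, T_{I-2}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-2}^{(K)})\; W_1^*(T_1^{(1)}, \dots, T_{I-2}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-2}^{(K)}), \tag{4.24}$$

where

$$W_1(T_1^{(1)}, \dots, T_{I-2}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-2}^{(K)}) = E\big(W(T_1^{(1)}, \dots, T_{I-1}^{(K)}) \mid T_1^{(1)}, \dots, T_{I-2}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-2}^{(K)}\big)$$

and

$$W_1^*(T_1^{(1)}, \dots, T_{I-2}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-2}^{(K)}) = E\big(W^*(T_1^{(1)}, \dots, T_{I-1}^{(K)}) \mid T_1^{(1)}, \dots, T_{I-2}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-2}^{(K)}\big).$$

Note that (4.21) implies $W_1$ and $W_1^* \in H_{K(I-2)(J-1)}$. Therefore, one can use (4.20)
again. Hence,

$$\begin{aligned}
E\,W_1 W_1^* &= E\{E[\,W_1(T_1^{(1)}, \dots, T_{I-2}^{(K)})\, W_1^*(T_1^{(1)}, \dots, T_{I-2}^{(K)}) \mid T_1^{(1)}, \dots, T_{I-3}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-3}^{(K)}]\} \\
&\ge E\{E(W_1(T_1^{(1)}, \dots, T_{I-2}^{(K)}) \mid T_1^{(1)}, \dots, T_{I-3}^{(K)})\; E(W_1^*(T_1^{(1)}, \dots, T_{I-2}^{(K)}) \mid T_1^{(1)}, \dots, T_{I-3}^{(K)})\} \\
&= E\,W_2(T_1^{(1)}, \dots, T_{I-3}^{(K)})\, W_2^*(T_1^{(1)}, \dots, T_{I-3}^{(K)})
\end{aligned} \tag{4.25}$$

by letting

$$W_2(T_1^{(1)}, \dots, T_{I-3}^{(K)}) = E\big(W_1(T_1^{(1)}, \dots, T_{I-2}^{(K)}) \mid T_1^{(1)}, \dots, T_{I-3}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-3}^{(K)}\big)$$

and

$$W_2^*(T_1^{(1)}, \dots, T_{I-3}^{(K)}) = E\big(W_1^*(T_1^{(1)}, \dots, T_{I-2}^{(K)}) \mid T_1^{(1)}, \dots, T_{I-3}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-3}^{(K)}\big). \tag{4.26}$$

The process can be repeated until we have that (4.25) is greater than or equal to

$$E\,W_{I-2}(T_1^{(1)}, \dots, T_1^{(K)})\, W_{I-2}^*(T_1^{(1)}, \dots, T_1^{(K)}) \ge E\,W_{I-2}(T_1^{(1)}, \dots, T_1^{(K)})\; E\,W_{I-2}^*(T_1^{(1)}, \dots, T_1^{(K)}). \tag{4.27}$$

The last step comes from (4.19) and the FGK inequality. Also, by the definition of $W_{I-2}$,

$$E\,W_{I-2}(T_1^{(1)}, \dots, T_1^{(K)}) = E\,W(T_1^{(1)}, \dots, T_{I-1}^{(1)}, \dots, T_1^{(K)}, \dots, T_{I-1}^{(K)}). \tag{4.28}$$

Similarly for $W^*$. Using (4.28) on the right-hand side of (4.27) we have

$$\begin{aligned}
E\{W(T)W^*(T) \mid m\} &= E\{W(T_1^{(1)}, \dots, T_{I-1}^{(K)})\, W^*(T_1^{(1)}, \dots, T_{I-1}^{(K)}) \mid m\} \\
&\ge E\{W(T_1^{(1)}, \dots, T_{I-1}^{(K)}) \mid m\}\; E\{W^*(T_1^{(1)}, \dots, T_{I-1}^{(K)}) \mid m\} \\
&= E\{W(T) \mid m\}\, E\{W^*(T) \mid m\}.
\end{aligned} \tag{4.29}$$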

Proof  of  Unbiasedness  in  Theorem  4.2.1 


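Now we show unbiasedness of tests in Theorem 4.2.1 in three-way tables. Let
$f_\psi(t \mid m)$, $T = (T_1^{(1)}, \dots, T_{I-1}^{(1)}, T_1^{(2)}, \dots, T_{I-1}^{(2)}, \dots, T_1^{(K)}, \dots, T_{I-1}^{(K)})$, denote the conditional
density of $T \mid m$, where $\psi$ lies in the alternative space, and let $f_0(t \mid m)$ be
the conditional density under the null. Using (4.18) we derive the conditional densities.
Then,

$$W^*(t) = \frac{f_\psi(t \mid m)}{f_0(t \mid m)} \propto \exp\Big(\sum_{k=1}^{K}\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \psi_{ij(k)} T_{ij(k)}\Big). \tag{4.30}$$

Hence, $W^*(t)$ is monotone nondecreasing in $T$, and $W^*(t) \in H_{K(I-1)(J-1)}$ for any
$\psi$ in the alternative space. Also, by assumption, the test $\varphi_m(t) \in H_{K(I-1)(J-1)}$.
Consider $E_\psi[\varphi_m(T) \mid m]$ for $\psi$ in the alternative space:

$$\begin{aligned}
E_\psi[\varphi_m(T) \mid m] &= \sum_t \varphi_m(t) W^*(t) f_0(t \mid m) \\
&\ge \Big[\sum_t \varphi_m(t) f_0(t \mid m)\Big]\Big[\sum_t W^*(t) f_0(t \mid m)\Big], \quad \text{by (4.22)} \\
&= \alpha.
\end{aligned} \tag{4.31}$$

The inequality follows by applying Lemma 4.3.1. Expression (4.31) implies conditional
unbiasedness of $\varphi_m(t)$, which in turn implies unbiasedness of the original
test $\varphi(N)$, by noting that
$E_{\psi,\nu}[\varphi(N)] = E_{\psi,\nu}\big[E_\psi(\varphi_m(T) \mid m)\big] \ge \alpha$, where $\nu$ refers
to the nuisance parameters. Hence, we finish the unbiasedness portion of Theorem
4.2.1.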


Proof  of  Lemmas 


Now we verify (4.19), (4.20), and (4.21), which are the conditions assumed for Lemma
4.3.1.

i) Under $H_0$, $T_1^{(k)}$ given $m$ is MTP2, for all $k = 1, \dots, K$.

Proof.

Let $1 \le k \le K$, and $i = 1, \dots, I-1$, $j = 1, \dots, J-1$. Then

$$\begin{aligned}
T_1^{(k)} \mid \{n_{i+k'}\}, \{n_{+jk'}\},\ k' = 1, \dots, K
&\overset{d}{=} T_1^{(k)} \mid \{n_{i+k}\}, \{n_{+jk}\}, \{n_{i+k'}\}, \{n_{+jk'}\}, \text{ where } k' = 1, \dots, k-1, k+1, \dots, K \\
&\overset{d}{=} T_1^{(k)} \mid \{n_{i+k}\}, \{n_{+jk}\},
\end{aligned} \tag{4.32}$$

since $\{n_{i+k'}\}$, $\{n_{+jk'}\}$ are independent of $T_1^{(k)}$, $\{n_{i+k}\}$, $\{n_{+jk}\}$. The last conditional
distribution is MTP2 by Lemma 4.2.2.


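ii) Under $H_0$, $T_i^{(k)} \mid T_1^{(1)}, \dots, T_{i-1}^{(1)}, \dots, T_1^{(K)}, \dots, T_{i-1}^{(K)}, m$ is MTP2
for all $i = 2, 3, \dots, I-1$, $k = 1, \dots, K$.

Proof.

Let $1 \le k \le K$. For all $i = 2, \dots, I-1$,

$$\begin{aligned}
&T_i^{(k)} \mid T_1^{(1)}, \dots, T_{i-1}^{(1)}, T_1^{(2)}, \dots, T_{i-1}^{(2)}, \dots, T_1^{(K)}, \dots, T_{i-1}^{(K)}, m \\
&\quad\overset{d}{=} T_i^{(k)} \mid T_1^{(k)}, \dots, T_{i-1}^{(k)}, T_1^{(1)}, \dots, T_{i-1}^{(1)}, \dots, T_1^{(k-1)}, \dots, T_{i-1}^{(k-1)}, T_1^{(k+1)}, \dots, T_{i-1}^{(k+1)}, \dots, T_1^{(K)}, \dots, T_{i-1}^{(K)}, m \\
&\quad\overset{d}{=} T_i^{(k)} \mid T_1^{(k)}, \dots, T_{i-1}^{(k)}, \{n_{i+k}\}, \{n_{+jk}\}, \quad \text{by the independence of the strata,}
\end{aligned}$$

which is MTP2 by Lemma 4.2.3.

iii) Under $H_0$, given $m$,

$$T_i^{(k)} \mid T_1^{(1)}, \dots, T_{i-1}^{(1)}, \dots, T_1^{(K)}, \dots, T_{i-1}^{(K)} \ \le_p\ T_i^{(k)} \mid T_1'^{(1)}, \dots, T_{i-1}'^{(1)}, \dots, T_1'^{(K)}, \dots, T_{i-1}'^{(K)} \tag{4.33}$$

for all $i = 2, 3, \dots, I-1$, $k = 1, \dots, K$,
if $T_j^{(k)} \le T_j'^{(k)}$ for $j = 1, \dots, i-1$, $k = 1, \dots, K$.

Proof.

Let $1 \le k \le K$, and $i' = 1, \dots, I-1$, $j' = 1, \dots, J-1$. For all $i = 2, 3, \dots, I-1$,

$$T_i^{(k)} \mid T_1^{(1)}, \dots, T_{i-1}^{(1)}, \dots, T_1^{(K)}, \dots, T_{i-1}^{(K)}, m
\ \overset{d}{=}\ T_i^{(k)} \mid T_1^{(k)}, \dots, T_{i-1}^{(k)}, \{n_{i'+k}\}, \{n_{+j'k}\},$$

by the independence of the strata. Likewise,

$$T_i^{(k)} \mid T_1'^{(1)}, \dots, T_{i-1}'^{(1)}, \dots, T_1'^{(K)}, \dots, T_{i-1}'^{(K)}, m
\ \overset{d}{=}\ T_i^{(k)} \mid T_1'^{(k)}, \dots, T_{i-1}'^{(k)}, \{n_{i'+k}\}, \{n_{+j'k}\},$$

by the independence of the strata. Hence, (4.33) is equivalent to

$$T_i^{(k)} \mid T_1^{(k)}, \dots, T_{i-1}^{(k)}, \{n_{i'+k}\}, \{n_{+j'k}\} \ \le_p\ T_i^{(k)} \mid T_1'^{(k)}, \dots, T_{i-1}'^{(k)}, \{n_{i'+k}\}, \{n_{+j'k}\}$$

for all $i = 2, 3, \dots, I-1$
if $T_j^{(k)} \le T_j'^{(k)}$, $j = 1, \dots, i-1$,
which is proven by Lemma 4.2.4.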

Hence, all three conditions assumed for Lemma 4.3.1 are established. We next
present the complete class of tests and admissible tests in an exponential family.

4.4 Complete Class of Tests


We show that the tests in Theorem 4.2.1 lie in the complete class of tests. From
(4.18) we have

$$f(N) = \beta(\psi, c, d, e, f, g) \exp\Big( \sum_{k=1}^{K}\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} \psi_{ij(k)} T_{ij(k)}
+ \sum_{i=1}^{I-1}\sum_{k=1}^{K-1} n_{i+k} c_{ik}
+ \sum_{j=1}^{J-1}\sum_{k=1}^{K-1} n_{+jk} d_{jk}
+ \sum_{i=1}^{I-1} n_{i++} e_i + \sum_{j=1}^{J-1} n_{+j+} f_j + \sum_{k=1}^{K-1} n_{++k} g_k \Big) \cdot g(t, m). \tag{4.34}$$

We rewrite (4.34) in the following family of distributions,

$$P(T, Z; \psi, w) = C(\psi, w) \exp[\psi' T + w' Z]. \tag{4.35}$$

That is, a random vector $(T, Z) \in R^m \times R^n$ has an exponential density. Let $\Theta$
denote the natural parameter space, and assume $(0, 0)$ is an interior point of $\Theta$.
Eaton described an essentially complete class of tests, and we need the following
notation to formulate Eaton's result.

Let $\psi$ be the parameter of interest and $w$ be the nuisance parameters. The
problem considered is that of testing the hypotheses

$$H_0 : \psi = 0,$$

$$H_a : \psi \in \Omega_1,$$

where $\Omega_1$ is contained in some half-space. It is assumed that for each $\psi \in \Omega_1$ there
exists a $w \in R^n$ such that $(\psi, w) \in \Theta$.

Let $V \subset R^m$ be the smallest convex cone containing $\Omega_1$, and let $V^-$ denote the
normal cone of $V$, i.e.,

$$V^- = \Big\{w \in R^m : \sum_{i=1}^{m} w_i v_i \le 0 \text{ for all } v \in V\Big\}, \quad m = \mathrm{df}. \tag{4.36}$$

Moreover, $\Phi$ stands for the class of nonempty closed convex sets in $R^m$ and

$$\Phi(V) = \{C : C \in \Phi \text{ and } V^- \subset C - c \text{ for each } c \in \partial C\}, \tag{4.37}$$

where $\partial C$ stands for the boundary of $C$.

Consider the set, $D^*(V)$, of test functions with the following property:
if $\varphi \in D^*(V)$, there exists a measurable set $A \subset R^m \times R^n$ such that each $Z$ section,
$A(Z) \subset R^m$, is in $\Phi(V)$ and

$$\varphi(t, z) = \begin{cases} 1 & \text{if } T \in A(Z)^c, \\ \gamma(t, z) & \text{if } T \in \partial A(Z), \\ 0 & \text{if } T \in \operatorname{Int} A(Z), \end{cases}$$

where $A(Z)^c$ refers to the complement of $A(Z)$. The notation $A(Z)$ refers to the $Z$
section of the acceptance region, that is, the acceptance region at fixed $Z = z$
when we consider the conditional test. Eaton (1970) showed that $D^*$ is an essentially
complete class for testing $H_0 : \psi = 0$ against $H_a : \psi \in \Omega_1$. In light of (4.34), the
testing problem in three-way contingency tables fits the framework of Eaton, which
yields the fact that the tests in Theorem 4.2.1 lie in the complete class of tests.

4.5  Admissible  Tests 


Matthes and Truax (1967) described the class of admissible tests on multivariate
exponential distributions for testing $H_0 : \psi = 0$ against $H_a : \psi \ne 0$, based on the
conditional distribution of $T$ given $Z$. This description is given under the assumption
that the support of the conditional distribution is finite. They showed that a test $\varphi$ is
admissible if and only if there exists a convex acceptance region, say $A(Z)$, equivalent
to $\varphi(\cdot, z)$, such that $\varphi(t, z) = 0$ at all nonextreme points of $A(Z)$. The notation $A(Z)$
refers to the $Z$ section of the acceptance region, the same as in Section 4.4. Using methods
developed by Matthes and Truax, Ledwina (1978a, 1984) gave admissibility of tests
on multivariate exponential distributions with discrete support. It is characterized
by the fact that the conditional distribution of $T$ given $Z = z$ is independent of
the nuisance parameters $w$. Hence, we consider admissibility on each section
$Z = z$ separately, and then obtain the class of admissible tests for the original
problem. The class of admissible tests for $H_0 : \psi = 0$ against $H_a : \psi \in \Omega_1 \subset R^m$
in (4.35), based on the conditional distribution of $T$ given $Z = z$, is described as
follows. A test $\varphi(t)$ is admissible if and only if there exists a set $A \in \Phi(V)$ in (4.37)
such that on each surface of $Z = z$, $A(Z) \subset R^m$ and

$$\varphi(t) = \begin{cases} 1 & \text{if } T \in A(Z)^c, \\ \gamma(t) & \text{if } T \in E, \\ 0 & \text{if } T \in \operatorname{Int} A(Z), \end{cases}$$

where $E$ denotes the set of all extreme points of $A$. This means that a test $\varphi(t)$
is admissible if and only if for each fixed $z$, the acceptance region is convex, and
randomization happens only at extreme points.

Ledwina (1984) also gave connections between admissibility of tests for the conditional
distributions and the initial problem of tests based on (4.35). Ledwina showed
that the test $\varphi(t, z)$ is admissible for testing $H_0$ against $H_a$ if and only if for every
fixed $Z = z$, the test $\varphi(\cdot, z)$ is admissible in the class of tests based on the conditional
distribution of $T$ given $Z = z$. From the arguments in Ledwina, the tests in
Theorem 4.2.1 are admissible. Hence, they are the exact, unbiased, and admissible
tests in three-way tables.


4.6 Exact, Unbiased and Admissible Tests


In this section we illustrate the exact, unbiased, and admissible tests that satisfy
the properties of Theorem 4.2.1. We discuss how to construct unbiased tests and
how to set up critical regions to obtain tests of conditional independence of fixed size
$\alpha$, for the ordinal alternative. We focus on three-way tables where row and column
classifications are ordinal, and the contents of Sections 4.3, 4.4 and 4.5 are combined
to give the unbiased and admissible tests. One advantage of ordinal models
over nominal-scale models is that tests based on ordinal models have more power
to detect certain types of association and interaction (Agresti, 1990).

The model of homogeneous linear-by-linear association, which utilizes the ordinality
of $X$ and $Y$, is

$$\log m_{ijk} = \mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \beta u_i v_j + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}. \tag{4.38}$$

We test conditional independence, $H_0 : \psi_{ij(k)} = 0$, or equivalently, $\beta = 0$, against the
alternative (4.38) of linear-by-linear association, using the sufficient statistic for $\beta$ in
that model,

$$T = \sum_{k}\sum_{i}\sum_{j} u_i v_j n_{ijk}. \tag{4.39}$$

We show that a test based on $T$ satisfies the conditions of Theorem 4.2.1, so it is
the exact, unbiased, and admissible test. First, we show that $T$ can be expressed as
$T = \sum_k \sum_{i=1}^{I-1}\sum_{j=1}^{J-1} [(u_i - u_{i+1})(v_j - v_{j+1})\, t_{ij(k)}] + C$, where $t_{ij(k)} = \sum_{a=1}^{i}\sum_{b=1}^{j} n_{abk}$ and
$C$ is a constant depending on the scores and the fixed marginal totals.
Let $u_{I+1} = v_{J+1} = 0$. Then,

$$\begin{aligned}
T &= \sum_{k=1}^{K}\sum_{i=1}^{I}\sum_{j=1}^{J} u_i v_j n_{ijk} \\
&= \sum_{k}\sum_{i}\sum_{j} \big[\{(u_i - u_{i+1}) + (u_{i+1} - u_{i+2}) + \cdots + (u_I - u_{I+1})\} \cdot
\{(v_j - v_{j+1}) + (v_{j+1} - v_{j+2}) + \cdots + (v_J - v_{J+1})\}\, n_{ijk}\big] \\
&= \sum_{k=1}^{K}\sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{a=i}^{I}\sum_{b=j}^{J} (u_a - u_{a+1})(v_b - v_{b+1})\, n_{ijk} \\
&= \sum_{k=1}^{K}\sum_{a=1}^{I}\sum_{b=1}^{J}\sum_{i=1}^{a}\sum_{j=1}^{b} (u_a - u_{a+1})(v_b - v_{b+1})\, n_{ijk} \\
&= \sum_{k=1}^{K}\sum_{a=1}^{I}\sum_{b=1}^{J} (u_a - u_{a+1})(v_b - v_{b+1}) \sum_{i=1}^{a}\sum_{j=1}^{b} n_{ijk} \\
&= \sum_{k=1}^{K}\sum_{i=1}^{I}\sum_{j=1}^{J} (u_i - u_{i+1})(v_j - v_{j+1})\, t_{ij(k)} \\
&= \sum_{k=1}^{K}\Big[\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} (u_i - u_{i+1})(v_j - v_{j+1})\, t_{ij(k)}
+ (u_I - u_{I+1})\sum_{j=1}^{J-1} (v_j - v_{j+1})\, t_{Ij(k)}
+ (v_J - v_{J+1})\sum_{i=1}^{I-1} (u_i - u_{i+1})\, t_{iJ(k)}
+ u_I v_J\, t_{IJ(k)}\Big] \\
&= \sum_{k=1}^{K}\sum_{i=1}^{I-1}\sum_{j=1}^{J-1} (u_i - u_{i+1})(v_j - v_{j+1})\, t_{ij(k)} + C.
\end{aligned}$$

Thus, $T$ is monotone in $\{t_{ij(k)}\}$ if the scores satisfy

$$(u_i - u_{i+1})(v_j - v_{j+1}) \ge 0, \tag{4.40}$$

for $i = 1, \dots, I-1$, $j = 1, \dots, J-1$; that is, if the scores $\{u_i\}$, $\{v_j\}$ are both monotone
increasing or both monotone decreasing. We note that the statistic $\sum_k\sum_i\sum_j t_{ij(k)}$
is a special case of $T$, for the equally-spaced scores $\{u_i = I - (i - 1)\}$ and $\{v_j =
J - (j - 1)\}$. Thus, tests based on $T$ are unbiased.
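
The decomposition of $T$ into a weighted sum of the cumulative counts plus a margin-dependent constant can be verified numerically. The following Python sketch does so for a randomly generated table; the table, the scores, and the function names are assumptions chosen for illustration, with 0-based indices and the convention $u_{I+1} = v_{J+1} = 0$ implemented by appending a zero to each score vector.

```python
import random

def cumulative(nk, i, j):
    """t_ij(k): sum of counts n_abk over a <= i, b <= j (0-based) in stratum table nk."""
    return sum(nk[a][b] for a in range(i + 1) for b in range(j + 1))

def decompose(n, u, v):
    """Return (T, main, C) with T = sum u_i v_j n_ijk, which should equal main + C."""
    I, J = len(u), len(v)
    ue, ve = list(u) + [0], list(v) + [0]          # u_{I+1} = v_{J+1} = 0
    T = sum(u[i] * v[j] * nk[i][j] for nk in n for i in range(I) for j in range(J))
    main = sum((ue[i] - ue[i + 1]) * (ve[j] - ve[j + 1]) * cumulative(nk, i, j)
               for nk in n for i in range(I - 1) for j in range(J - 1))
    # C involves only cumulative counts in the last row or column, i.e. fixed margins.
    C = sum(u[I - 1] * sum((ve[j] - ve[j + 1]) * cumulative(nk, I - 1, j) for j in range(J - 1))
            + v[J - 1] * sum((ue[i] - ue[i + 1]) * cumulative(nk, i, J - 1) for i in range(I - 1))
            + u[I - 1] * v[J - 1] * cumulative(nk, I - 1, J - 1)
            for nk in n)
    return T, main, C

random.seed(1)
I, J, K = 3, 4, 2
n = [[[random.randint(0, 5) for _ in range(J)] for _ in range(I)] for _ in range(K)]
u = [I - i for i in range(I)]      # equally spaced scores 3, 2, 1
v = [J - j for j in range(J)]      # equally spaced scores 4, 3, 2, 1
T, main, C = decompose(n, u, v)
print(T, main + C)
```

With these equally spaced scores every coefficient $(u_i - u_{i+1})(v_j - v_{j+1})$ equals 1, so `main` is simply the sum of the cumulative counts over $i < I$, $j < J$ and all strata.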

In constructing critical regions, we utilize a secondary statistic, $T'$, for ordering
the tables for which $T = t_0$. The secondary statistic generates a secondary
partitioning that is used to set up critical regions giving tests of conditional independence of
fixed size $\alpha$. When $I = J = 2$, we could use $T' = \sum_k X_k^2$ to order the tables for which
$T = t_0$. The approach of Cohen and Sackrowitz (1992) is to utilize the conditional
null probabilities of the tables to order them. These relate to the modified P-value, which we
discussed in Chapter 2. The same argument applies if one uses some other secondary
statistic. Let $C_\alpha$ be a constant, depending on $m$, such that

$$P\{T \ge C_\alpha\} > \alpha \quad \text{and} \quad P\{T > C_\alpha\} = A < \alpha.$$

The test rejects if $T > C_\alpha$. When $T = C_\alpha$, consider all tables having $T = C_\alpha$
and order the tables according to their secondary test statistic values. When
large values of $T$ contradict the null, attention can be given to the tables having
larger values of $T'$ among the tables having $T = C_\alpha$. Alternatively, if some
table has small probability under the null hypothesis, such a table
would be unlikely to occur if $H_0$ is true, and for a particular value of $T$, a smaller
table probability under the null corresponds to stronger contradiction of the null
hypothesis. Hence, when the null table probability is used as the secondary statistic,
attention can be given to the tables with the smallest null probabilities among those
having $T = C_\alpha$. Thus, when $T = C_\alpha$, we reject for those tables
whose secondary statistic values are the largest or whose probabilities are smallest,
and whose probabilities total at most $(\alpha - A)$. Instead of randomizing on all tables
where $T = C_\alpha$, we allow randomization only at extreme points of a convex acceptance
section of the remaining points, so that the test is exact, unbiased, and admissible.
We denote a test of this form by $\varphi^*$.
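
As a minimal sketch of the first step of this construction, the following Python function finds the constant $C_\alpha$ and the tail probability $A$ from a conditional null distribution of $T$. The function name `critical_value` and the toy distribution are assumptions for illustration; the secondary ordering and the convexity check are not implemented here.

```python
from fractions import Fraction

def critical_value(null_dist, alpha):
    """Return (C, A) with P(T >= C) > alpha and A = P(T > C) <= alpha.

    null_dist maps each attainable value of T to its conditional null probability.
    """
    upper_tail = Fraction(0)                 # P(T > c) for the current candidate c
    for c in sorted(null_dist, reverse=True):
        if upper_tail + null_dist[c] > alpha:
            return c, upper_tail
        upper_tail += null_dist[c]
    raise ValueError("alpha must be smaller than 1")

# Hypothetical null distribution of T (not taken from the dissertation).
toy = {0: Fraction(1, 2), 1: Fraction(3, 10), 2: Fraction(3, 20), 3: Fraction(1, 20)}
C, A = critical_value(toy, Fraction(1, 10))
print(C, A)
```

In this toy case the test would reject outright for $T > 2$ and would have probability mass of at most $\alpha - A = 1/20$ left to allocate among the tables with $T = 2$.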

Forming the critical region in this way gives a test that is less likely to require
randomization than the usual test $\varphi$ that randomizes on the entire set $\{n : m \text{ fixed}, T =
C_\alpha\}$. Also, the modified test is better than $\varphi$, since usually the entire set of tables
having $T = C_\alpha$ contains nonextreme points, making $\varphi$ inadmissible.

In this section, we have shown that a test based on $T$ using monotone scores
satisfies the properties of Theorem 4.2.1, since a test based on $T$ has a desirable
monotonicity property by the construction of $T$, and we allow randomization only
at extreme points of the convex acceptance section. Hence, the test $\varphi^*$ is exact,
unbiased, and admissible. A nonrandomized test using $T$ is unbiased and admissible,
but it would be conservative when used with a fixed size $\alpha$. The test $\varphi^*$, by contrast, would
have actual size closer to the nominal level than the ordinary test.

4.7  Example 


We consider the test of conditional independence in three-way contingency tables,
where row and column variables are ordinal. We assume that the model of no
three-factor interaction holds, so that we can construct tests with increased power against
important alternatives. We will illustrate construction of an exact, unbiased, and
admissible test using $2 \times 2 \times 5$ contingency tables. When $I = J = 2$, the usual statistic
$\sum_k n_{11k}$ results from the scores $u_1 = v_1 = 1$, $u_2 = v_2 = 0$ in $T = \sum_k\sum_i\sum_j u_i v_j n_{ijk}$.

When $I = J = 2$, the test $\varphi^*$ gives an alternative to the ordinary one for testing
conditional independence for a set of $2 \times 2$ tables, under the assumption of a common
odds ratio. The ordinary test is often inadmissible. For an $I \times J$ table, we can
construct exact, unbiased, and admissible tests for an ordinal alternative to independence
by using a modified approach, but it is not easy to display the acceptance section if
$I$ and $J$ are greater than 3.

4.7.1 Test of Conditional Independence: 2 x 2 x 5 Tables

We utilize the middle three subtables of Table 2.1 to illustrate construction of
an exact, unbiased, and admissible test. We study size $\alpha = 0.05$ tests based on
$T = \sum_k n_{11k}$. Given the $D$ and $C$ marginal totals at each of the three levels of $P$ in these subtables, $n_{112}$
can range between 0 and 3, $n_{113}$ can range between 2 and 6, and $n_{114}$ can equal 5
or 6. The whole distribution of $n_{112}$, $n_{113}$, and $n_{114}$ is composed of 40 tables. Since
$P\{T \ge 13\} = 0.1136 > \alpha$ and $P\{T > 13\} = 0.0200 < \alpha$, randomization is required
for those tables with $T = 13$. We use $\sum_k X_k^2$ or the null table probability for the
secondary statistic. The following are the tables with $T \ge 13$.


1) $T = 15$: $(n_{112}, n_{113}, n_{114}) =$

   (3,6,6) with $P(3,6,6) = 2/1452$, $\sum_k X_k^2 = 11.09$

2) $T = 14$: $(n_{112}, n_{113}, n_{114}) =$

   (2,6,6) with $P(2,6,6) = 9/1452$, $\sum_k X_k^2 = 7.54$
   (3,5,6) with $P(3,5,6) = 16/1452$, $\sum_k X_k^2 = 6.59$
   (3,6,5) with $P(3,6,5) = 2/1452$, $\sum_k X_k^2 = 11.09$

3) $T = 13$: $(n_{112}, n_{113}, n_{114}) =$

   (1,6,6) with $P(1,6,6) = 9/1452$, $\sum_k X_k^2 = 7.54$
   (2,6,5) with $P(2,6,5) = 9/1452$, $\sum_k X_k^2 = 7.54$
   (3,5,5) with $P(3,5,5) = 16/1452$, $\sum_k X_k^2 = 6.59$
   (3,4,6) with $P(3,4,6) = 30/1452$, $\sum_k X_k^2 = 5.09$
   (2,5,6) with $P(2,5,6) = 72/1452$, $\sum_k X_k^2 = 3.04$.


The usual 0.05-size conditional test based on $T$ is

$$\varphi(n) = \begin{cases}
1 & \text{if } (n_{112}, n_{113}, n_{114}) = (3,6,6), (2,6,6), (3,5,6), (3,6,5) \\
0.3206 & \text{if } (n_{112}, n_{113}, n_{114}) = (1,6,6), (2,5,6), (3,4,6), (2,6,5), (3,5,5) \\
0 & \text{otherwise.}
\end{cases}$$

This test randomizes with equal probability on all tables for which $T = 13$. Since
the table (2,5,6) is an interior point of the line segment between tables (1,6,6) and (3,4,6),
it is not an extreme point of a convex acceptance region. This makes $\varphi$ inadmissible,
since randomization should occur only at extreme points in order for a test to be
admissible. Hence, another test $\varphi'$ will beat the test $\varphi$.
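
The quantities quoted for the usual test can be reproduced with exact arithmetic. The Python sketch below uses the null probabilities listed above for the tables with $T \ge 13$; the numerator 2 for table (3,6,6) is an assumption, being the value implied by the stated $P\{T > 13\} = 0.0200$. Variable names are illustrative only.

```python
from fractions import Fraction

ALPHA = Fraction(5, 100)
DENOM = 1452
# Numerators of the null probabilities for the tables with T >= 13.
numerators = {
    (3, 6, 6): 2,    # assumed: implied by P{T > 13} = 0.0200
    (2, 6, 6): 9, (3, 5, 6): 16, (3, 6, 5): 2,
    (1, 6, 6): 9, (2, 6, 5): 9, (3, 5, 5): 16, (3, 4, 6): 30, (2, 5, 6): 72,
}
prob = {t: Fraction(k, DENOM) for t, k in numerators.items()}

# T is the sum of the three n_11k counts, since the scores are u_1 = v_1 = 1, u_2 = v_2 = 0.
p_above = sum(p for t, p in prob.items() if sum(t) > 13)       # P{T > 13}
p_boundary = sum(p for t, p in prob.items() if sum(t) == 13)   # P{T = 13}
gamma_usual = (ALPHA - p_above) / p_boundary                   # randomization probability

# (2,5,6) is the midpoint of (1,6,6) and (3,4,6), hence not an extreme point.
midpoint = tuple(Fraction(a + b, 2) for a, b in zip((1, 6, 6), (3, 4, 6)))

print(float(p_above), float(p_above + p_boundary), float(gamma_usual))
print(midpoint == (2, 5, 6))
```

Rounded to four decimals, the three printed probabilities agree with the values 0.0200, 0.1136, and 0.3206 given in the text.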


Since the table (1,6,6) has the largest $\sum_k X_k^2$ value or the smallest null table
probability among tables for which $T = 13$, it can be included in the rejection region. The
table (2,5,6) is now an extreme point for this test. Since randomization is permitted
only on the extreme points of a convex acceptance region, the resulting test is admissible. The exact
test that orders the tables according to their secondary statistic values is constructed as follows.

We can add tables to the rejection region as long as the probability of rejection does not
exceed the size. Hence, two tables (2,6,5) and (3,5,5) are entered into the
rejection region, since they have the next largest $\sum_k X_k^2$ values or the next smallest null
table probabilities. Furthermore, the table (2,5,6), which has a table probability
close to our size, can be excluded from randomization so that the table (3,4,6) is the
only extreme point for possible randomization. This gives the test

$$\varphi^*(n) = \begin{cases}
1 & \text{if } (n_{112}, n_{113}, n_{114}) = (3,6,6), (2,6,6), (3,5,6), (3,6,5), (1,6,6), (2,6,5), (3,5,5) \\
0.3200 & \text{if } (n_{112}, n_{113}, n_{114}) = (3,4,6) \\
0 & \text{otherwise.}
\end{cases}$$

The test $\varphi^*$ randomizes only on the
extreme point (3,4,6) of its convex acceptance region, and it satisfies the properties of
Theorem 4.2.1. Hence, it is exact, unbiased, and admissible. Compared to the previous
test, it has the advantage of having only a single table for which randomization is
necessary. The probability that randomization is required is only 0.0207, rather than
0.0937. In this data set, we get the same exact, unbiased, and admissible
test using either $\sum_k X_k^2$ or the null table probability for the secondary statistic.
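
The figures for the modified test can be checked in the same way. This Python sketch takes the rejection and randomization sets described above together with the listed null probabilities (again assuming the numerator 2 for table (3,6,6)) and recomputes the randomization probability, the size, and the probability that randomization is needed. Names are illustrative only.

```python
from fractions import Fraction

ALPHA = Fraction(1, 20)
DENOM = 1452
# Tables rejected outright by the modified test (numerators of null probabilities).
reject = {(3, 6, 6): 2,      # assumed numerator, implied by P{T > 13} = 0.0200
          (2, 6, 6): 9, (3, 5, 6): 16, (3, 6, 5): 2,
          (1, 6, 6): 9, (2, 6, 5): 9, (3, 5, 5): 16}
randomize = {(3, 4, 6): 30}  # the single extreme point where randomization is allowed
boundary_usual = 9 + 9 + 16 + 30 + 72   # all tables with T = 13, used by the usual test

p_reject = Fraction(sum(reject.values()), DENOM)
p_random = Fraction(sum(randomize.values()), DENOM)
gamma_star = (ALPHA - p_reject) / p_random
size = p_reject + gamma_star * p_random

print(float(gamma_star), float(size))
print(round(float(p_random), 4), round(float(Fraction(boundary_usual, DENOM)), 4))
```

The randomization probability comes out as exactly 8/25 = 0.32, the size as exactly 0.05, and the two randomization-required probabilities round to 0.0207 and 0.0937.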


4.8  Discussion 


For $I \times J \times K$ tables, we generalized results of Cohen and Sackrowitz (1992) and
showed how to construct exact, unbiased, and admissible tests for an ordinal alternative
to conditional independence. The ordinary exact test of conditional independence
for $2 \times 2 \times K$ tables is often inadmissible. In practice, randomized tests are
unacceptable. Thus, even the tests described in Section 4.6 that require less randomization
than usual are not intended for practical use. However, results of that section suggest
an obvious way of forming critical regions for tests so that one can have actual size
closer to a desired value (such as 0.05) than would be possible with the ordinary test.


CHAPTER  5 
CONCLUSION 


5.1  Discussion 


The  conservativeness  due  to  the  discreteness  of  a statistic  is  a typical  problem 
for  exact  inference  with  categorical  data.  Ways  of  reducing  the  conservativeness  in 
exact  tests  and  confidence  intervals  were  proposed  in  Chapter  2.  We  prefer  modified 
exact  tests  and  confidence  intervals  to  the  ordinary  exact  ones  because  they  are  less 
conservative  than  the  ordinary  ones,  but  still  guarantee  at  least  the  nominal  level.  We 
also  prefer  confidence  intervals  based  on  inverting  two-sided  tests  over  those  based  on 
inverting  two  separate  one-sided  tests  because  they  tend  to  be  less  conservative. 
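
The reduction in conservativeness can be illustrated numerically. The following Python sketch is not part of the dissertation's software. It assumes that the modified P-value has the form P(T > t_obs) + P(T = t_obs and table probability <= observed table probability), with T the sum of the (1,1) cells as the primary statistic and the null table probability as the secondary statistic. The function names and the toy margins are hypothetical.

```python
from itertools import product
from math import comb


def hypergeom_pmf(r1, r2, c1):
    """Null distribution of the (1,1) cell of a 2x2 table with row totals
    r1, r2 and first-column total c1 (all margins fixed)."""
    lo, hi = max(0, c1 - r2), min(r1, c1)
    denom = comb(r1 + r2, c1)
    return {x: comb(r1, x) * comb(r2, c1 - x) / denom for x in range(lo, hi + 1)}


def p_values(strata, observed):
    """Return (ordinary, modified) one-sided P-values by full enumeration.

    strata   : list of (r1, r2, c1) margins, one per 2x2 stratum
    observed : observed (1,1) cell count in each stratum
    """
    pmfs = [hypergeom_pmf(*s) for s in strata]
    t_obs = sum(observed)
    p_obs = 1.0
    for pmf, x in zip(pmfs, observed):
        p_obs *= pmf[x]

    ordinary = modified = 0.0
    for cells in product(*[list(pmf) for pmf in pmfs]):
        prob = 1.0
        for pmf, x in zip(pmfs, cells):
            prob *= pmf[x]
        t = sum(cells)
        if t >= t_obs:
            ordinary += prob
        # Assumed tie-breaking rule: among tables with T = t_obs, count only
        # those no more probable than the observed table.
        if t > t_obs or (t == t_obs and prob <= p_obs * (1 + 1e-9)):
            modified += prob
    return ordinary, modified


if __name__ == "__main__":
    print(p_values([(3, 3, 3), (4, 2, 3)], (3, 2)))
```

For these toy margins the ordinary P-value is 0.13 and the modified P-value is 0.04, because the only other table with the same value of T is more probable than the observed table and is therefore excluded.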

The approach using a modified P-value can be utilized in approximating exact
inference regarding conditional associations in I x J x K tables. In Chapter 3 we
discussed six test statistics for conditional independence. We obtained precise
estimates of ordinary and modified exact P-values by using a simulation algorithm for
cases that currently are computationally infeasible.
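
The idea of estimating exact P-values by simulation can be sketched for the simpler 2 x 2 x K case. The Python fragment below is only an illustration and is not the I x J x K algorithm of Chapter 3. It samples each stratum independently from its null hypergeometric distribution and reports the proportion of sampled tables that are at least as extreme as the observed one. The function names, the seed, and the assumed form of the modified P-value (ties on the primary statistic broken by the null table probability) are hypothetical choices made for this sketch.

```python
import random
from math import comb


def hypergeom_pmf(r1, r2, c1):
    """Null pmf of the (1,1) cell given row totals r1, r2 and column total c1."""
    lo, hi = max(0, c1 - r2), min(r1, c1)
    denom = comb(r1 + r2, c1)
    return {x: comb(r1, x) * comb(r2, c1 - x) / denom for x in range(lo, hi + 1)}


def draw_n11(rng, r1, r2, c1):
    """Draw the (1,1) cell by sampling c1 subjects without replacement from
    an urn holding r1 row-1 subjects (coded 1) and r2 row-2 subjects (coded 0)."""
    urn = [1] * r1 + [0] * r2
    return sum(rng.sample(urn, c1))


def estimate_p_values(strata, observed, n_sim=20000, seed=1):
    """Monte Carlo estimates of the (ordinary, modified) one-sided P-values."""
    rng = random.Random(seed)
    pmfs = [hypergeom_pmf(*s) for s in strata]
    t_obs = sum(observed)
    p_obs = 1.0
    for pmf, x in zip(pmfs, observed):
        p_obs *= pmf[x]

    hits_ordinary = hits_modified = 0
    for _ in range(n_sim):
        cells = [draw_n11(rng, *s) for s in strata]
        t = sum(cells)
        prob = 1.0
        for pmf, x in zip(pmfs, cells):
            prob *= pmf[x]
        if t >= t_obs:
            hits_ordinary += 1
        if t > t_obs or (t == t_obs and prob <= p_obs * (1 + 1e-9)):
            hits_modified += 1
    return hits_ordinary / n_sim, hits_modified / n_sim


if __name__ == "__main__":
    print(estimate_p_values([(3, 3, 3), (4, 2, 3)], (3, 2)))
```

With n_sim independent tables, the standard error of each estimate is roughly sqrt(p(1 - p)/n_sim), so the precision can be controlled by the number of simulated tables.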

For I x J x K tables, Chapter 4 discussed construction of tests of conditional
independence that are exact, unbiased, and admissible for an ordinal alternative.
By using a modified approach, less randomization is required than usual, and we
obtain actual size closer to a nominal level. The ordinary exact test of conditional
independence for 2 x 2 x K tables is often inadmissible, and we showed how to obtain
improved tests.


5.2  Future Research


We have considered improved "exact" inference about conditional association in
2 x 2 x K contingency tables. The idea of a modified P-value can be applied to any
contingency table, and it can be calculated for any test statistic having a discrete
distribution. One research study could be the application of the modified approach
to exact tests of no three-factor interaction. Zelen (1971) presented an exact test of
homogeneity of odds ratios in 2 x 2 x K tables. For an exact test of no three-factor
interaction for 2 x 2 x K tables, an efficient score statistic against the saturated model
is the Pearson statistic for testing the fit of that model (Agresti 1992). We could use
this score statistic as a primary statistic and the table probability as a secondary
statistic to define modified P-values. We could study how much improvement can be
obtained by using a modified approach.

We could consider a modified confidence interval for the $\beta$ parameter in the
linear-by-linear association model. Under the alternative, the conditional distribution of
$T = \sum_i \sum_j u_i v_j n_{ij}$ has a noncentral hypergeometric distribution (2.10),
where $e^{\beta} = \theta$, and $C_t$ is the sum of $(\prod_i \prod_j n_{ij}!)^{-1}$ for all
tables with the given marginal distributions having $T = t$ (Agresti et al. 1990). By using
a modified confidence interval, we could reduce the conservativeness of the
Agresti-Mehta-Patel interval.

As we mentioned in Section 2.4.1 for 2 x 2 x K tables, we could base confidence
intervals on tests in which the two-sided P-value uses a non-null test statistic, instead
of the table probability. For instance, we could consider a test statistic

\[
T(\theta) = \frac{\left[ \sum_k (n_{11k} - m_{11k}) \right]^2}{\sum_k V_\theta(n_{11k})},
\]

where, under the alternative assuming $\theta$, $m_{11k}$ is the mean of $n_{11k}$ and
$V_\theta(n_{11k})$ is its variance. Since for a fixed value of $\theta$,
$\sum_k V_\theta(n_{11k})$ is a constant, $T(\theta)$ depends only on its numerator. By
using the exact non-null distribution, we could construct a two-sided ordinary or
modified confidence interval.

Another area to consider is how we can apply importance sampling (Mehta, Patel,
and Senchaudhuri 1988) as an alternative to conventional Monte Carlo sampling to
simulate the exact distribution and to estimate exact significance levels. In importance
sampling, the tables are selected in proportion to their importance for reducing
the variance of the estimated Monte Carlo P-values, whereas in Monte Carlo sampling,
the tables are sampled independently with replacement from the reference set.
The accuracy and the speed will be increased by using importance sampling.

We  could  use  a simulation  algorithm  to  approximate  exact  confidence  intervals. 
Then,  we  need  to  have  an  algorithm  to  simulate  the  non-null  distribution.  Under  the 
alternative,  the  joint  probability  distribution  of  a table  has  a noncentral  hypergeo- 
metric  distribution,  and  random  tables  should  satisfy  the  association  structure  as  well 
as  the  fixed  margins.  As  we  construct  an  “exact”  confidence  interval  for  a parameter 
by  inverting  the  results  of  the  exact  conditional  tests  based  on  ordinary  or  modified 
exact  P-values,  we  can  approximate  exact  confidence  intervals  for  a parameter  by  the 
same  method  based  on  the  estimate  of  ordinary  or  modified  exact  P-values. 
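
Test inversion can be sketched as follows in Python. The sketch assumes that the conditional distribution of T under odds ratio theta is proportional to c_t * theta**t, where the coefficients c_t are supplied by the caller, and it computes only the ordinary interval based on two separate one-sided tests, each at level alpha/2. Bisection on the log odds ratio scale is used here for brevity; the program in Appendix A calls a routine named brent instead. All function names are hypothetical.

```python
from math import exp, inf


def tail_probs(coeffs, theta, t_obs):
    """Return (P(T >= t_obs), P(T <= t_obs)) when P(T = t) is proportional
    to coeffs[t] * theta**t."""
    weights = {t: c * theta ** t for t, c in coeffs.items()}
    total = sum(weights.values())
    upper = sum(w for t, w in weights.items() if t >= t_obs) / total
    lower = sum(w for t, w in weights.items() if t <= t_obs) / total
    return upper, lower


def bisect_root(f, lo=-30.0, hi=30.0, iters=200):
    """Root of an increasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_interval(coeffs, t_obs, alpha=0.05):
    """Ordinary 'exact' interval for theta from two one-sided tests."""
    t_min, t_max = min(coeffs), max(coeffs)
    if t_obs == t_min:
        lower = 0.0  # no lower limit when T is at its minimum
    else:
        # P(T >= t_obs) increases with theta; solve for alpha/2.
        lower = exp(bisect_root(
            lambda b: tail_probs(coeffs, exp(b), t_obs)[0] - alpha / 2))
    if t_obs == t_max:
        upper = inf  # no upper limit when T is at its maximum
    else:
        # P(T <= t_obs) decreases with theta; solve for alpha/2.
        upper = exp(bisect_root(
            lambda b: alpha / 2 - tail_probs(coeffs, exp(b), t_obs)[1]))
    return lower, upper


if __name__ == "__main__":
    # Coefficients for a single 2x2 table with all margins equal to 3.
    print(exact_interval({0: 1, 1: 9, 2: 9, 3: 1}, t_obs=2))
```

To approximate such an interval by simulation, the exact tail probabilities in tail_probs would be replaced by Monte Carlo estimates computed from tables sampled under the non-null distribution. For large values of T, theta**t should be evaluated on the log scale to avoid overflow.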

Also, we could approximate exact inference for the test of no three-factor interaction.
In this case the conditional reference set is the set of I x J x K tables whose
XY, XZ, and YZ marginal tables are fixed at the corresponding values of the observed
table. More power would be obtained for narrower alternatives that utilize ordinality.

For the test of conditional independence in I x J x K tables, we defined the class
of  exact,  unbiased,  and  admissible  tests.  There  are  other  null  hypotheses  of  interest. 
We  could  consider  the  class  of  exact,  unbiased,  and  admissible  tests  for  testing  no 
three-factor  interaction  against  an  ordinal  alternative. 

In  summary,  we  suggested  exact  inference  regarding  conditional  associations  in 
three-way  tables,  modifying  the  usual  exact  conditional  approach.  This  seems  to  be  a 
promising approach for categorical data analysis, and more work can be done utilizing
this  approach. 


APPENDIX  A 

SOURCE CODE FOR EXACT INFERENCE


The following is FORTRAN source code for computing the ordinary and modified
exact P-values, four types of confidence intervals, and coverage probabilities. The data
or the name of a data file can be entered from the console, and the program provides
either the four types of confidence intervals or the coverage probabilities, according to
the option selected. When the coverage probability is requested, the program writes five
output files: "OO.CI" for the one-sided ordinary exact confidence interval, "OM.CI" for
the one-sided modified exact confidence interval, "TO.CI" for the two-sided ordinary
exact confidence interval, "TM.CI" for the two-sided modified exact confidence interval,
and "COVER.P" for the coverage probabilities of the four types of confidence intervals.
This program, for 2 x 2 x K tables, is an adaptation of one written by Vollset and
Hirji (1991) for ordinary exact inference.
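
Before the listing, the central computation can be summarized in a few lines. The Python sketch below is not a translation of the FORTRAN program. It only illustrates, with exact integer arithmetic, how the null distribution of T (the sum of the (1,1) cells) can be built by convolving the hypergeometric coefficients of the individual strata, which is the task of the subroutine CNV2X2. The log-scale overflow protection used in the FORTRAN code is omitted, and the function names are hypothetical.

```python
from math import comb


def stratum_coeffs(n11, n12, n21, n22):
    """Unnormalized null hypergeometric weights for the (1,1) cell of one
    2x2 table, returned with the smallest feasible cell value."""
    r1, r2, c1 = n11 + n12, n21 + n22, n11 + n21
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return lo, [comb(r1, x) * comb(r2, c1 - x) for x in range(lo, hi + 1)]


def convolve(tables):
    """Return (k1, dist), where dist[i] is the weight of T = k1 + i."""
    k1, dist = 0, [1]
    for table in tables:
        lo, weights = stratum_coeffs(*table)
        k1 += lo
        new = [0] * (len(dist) + len(weights) - 1)
        for i, d in enumerate(dist):
            for j, w in enumerate(weights):
                new[i + j] += d * w
        dist = new
    return k1, dist


def upper_tail_p(tables):
    """Ordinary one-sided exact P-value P(T >= t_obs) under conditional
    independence."""
    k1, dist = convolve(tables)
    t_obs = sum(table[0] for table in tables)
    return sum(dist[t_obs - k1:]) / sum(dist)


if __name__ == "__main__":
    # Each table is entered as n11, n12, n21, n22, as in the FORTRAN input.
    print(upper_tail_p([(3, 0, 0, 3), (2, 2, 1, 1)]))
```

Python integers do not overflow, so no rescaling is needed in this sketch. The FORTRAN routine works in double precision and therefore switches to a logarithmic scale when the coefficients become too large.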


      integer itab(1000,4),IOTOT
      INTEGER NIK(2,1000),NJK(2,1000),NTOT(1000)
      INTEGER ISUMA,J,SCD
      INTEGER*2 JH3,JM3,JS3,JSS3
      integer infhyl(1000),infhyu(1000),INUM(270000,20),INUM1(270000,1)
      double precision hyp(0:2000),ds(0:1,0:5500),dd1,lge
      DOUBLE PRECISION C(5500),B,LLL,K,FF,R1(5),R2(5)
      DOUBLE PRECISION ROOTRF,EPS,X0,X1,X3
      DOUBLE PRECISION LL,UL,MH,MUE,KA(100,2),RUL,RLL,MIDP,MAXP,PVAL2
      DOUBLE PRECISION FLOWER,FUPPER,FLOWBO,FUPPBO,MAXPE,pobsh,ODR
      DOUBLE PRECISION ALPHA,VRBG,SVRBG,ALL,AUL,START,ELL,EUL,ELL1,EUL1
      DOUBLE PRECISION PALPHA,P_UP,P_LO,P_UP1,P_LO1
      DOUBLE PRECISION P_UP2,P_LO2,ELL2,EUL2
      DOUBLE PRECISION OOCI(270000,2),OMCI(270000,2),TOCI(270000,2)
      DOUBLE PRECISION TMCI(270000,2),COVER(1600,5)
      DOUBLE PRECISION HYPD10(270000,1),POOCI,POMCI,PTOCI,PTMCI
      DOUBLE PRECISION ELL3,EUL3,P_LO3,P_UP3,DALL
      DOUBLE PRECISION DENO,hypd(1000,0:2000),POBSH1,PEXIMP,PEX
      DOUBLE PRECISION HYPD2(270000,20),HYPD1(270000,1)
      DOUBLE PRECISION CHI(270000),CHIOBS
      CHARACTER*16 FNAME
      COMMON/PARAM/C,J,SCD,K,FF
      COMMON/CI1/ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh
      COMMON /CH/ NIK,NJK,NTOT
      COMMON /ART/ IOTOT
      COMMON/CI2/hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX
      COMMON /DKIM/ DENO,ITOT,ISUML,INUM,HYPD2,INUM1,HYPD1
      COMMON /CHI/ CHI,CHIOBS
C
      EXTERNAL FLOWER,FUPPER,FLOWBO,FUPPBO
C
      DATA LGE /307.0D+00/
      DATA KA(95,1) /3.84145882D+00/ KA(95,2) /2.5D-02/
      DATA KA(90,1) /2.70554345D+00/ KA(90,2) /5.0D-02/
      DATA KA(99,1) /6.63489660D+00/ KA(99,2) /5.0D-03/
      DATA KA(80,1) /1.64237442D+00/ KA(80,2) /1.0D-01/
      DATA KA(50,1) /0.45493642D+00/ KA(50,2) /2.5D-01/
      DATA MIDP /0.D00/, MAXP /0.D00/
C
C     FF    IS 1 FOR EXACT AND 0.5 FOR MID-P EXACT
C     IMAX  MAX NO. OF ITERATIONS
C     K     ALPHA/2
C     EPS   STOPPING CRITERION


      LGE=307.0D+00
      WRITE(*,10000)
10000 FORMAT(3(/),T12,'*****  Ex2x2xK (version 24.0 - 5/94)  *****',/,
     1 /,T12,'Ordinary and Modified Exact P-values and CIs',/,
     2 T12,'for several 2x2 tables.',/,
     3 T12,' One-sided and Two-sided Approach : ',/)
      write(*,10001)
10001 FORMAT(T7,' This program calculates',/,
     1 T7,' 1. Ordinary and Modified Exact P-values, ',/,
     2 T7,' 2. Four Types of Exact Confidence Limits',/,
     3 T7,'    for the Common Odds Ratio, and',/,
     4 T7,' 3. Coverage Probability for CIs.')
      WRITE(*,10002)
10002 FORMAT(/,T7,' The program of Vollset, Hirji, Elashoff is ',
     + /,T7,' graciously provided and slightly modified.',
     + /,T7,' Several routines are added for modified ',
     + 'exact inference.')
      WRITE(*,10004)
10004 FORMAT(/,/,T7,' Any questions about the use of this software',
     + /,T7,' can be directed to Dr. Alan Agresti or Donguk Kim.')


C
      IA = 95
C     1-ALPHA/2
      ALPHA=1.D0-DBLE(IA)/100.D0
    1 FF = 0.D+00
      IK=0
      IMAX=50
      K = KA(IA,2)
      EPS = 1.D-09
c     intrinsic functions : float(); dexp()
c
      iin=5
      iot=6
c
c     maximum number of strata = 1000
c     maximum value of range of
c       hypergeometric distribution = 2000
c     maximum value of range of
c       final distribution = 5500
      mxs = 1000
      mxz = 2000
      mxd = 5500
c
c     maximum stratum size
c
      mxss = 500000

C
C     READ DATA
C
   10 WRITE(IOT,999)
      WRITE(IOT,997)
      READ(IIN,15)FNAME
C     FNAME='peni.dat'
   15 FORMAT(A16)
      IF(FNAME .EQ. 'c' .OR. FNAME .EQ. 'C')THEN
      PRINT *,'GIVE 1-ALPHA/2: 50,80,90,95 OR 99)'
      READ(IIN,16)IA
   16 FORMAT(I2)
      GOTO 1
      END IF
      IF (FNAME .EQ. 'k' .OR. FNAME .EQ. 'K')THEN
CCCCCCCCCCCCCCCCCCCCCCCCCCCCC
   18 write(iot,20)
   20 format(/10x,'Enter no. of strata')
      read(iin,*)ik
      if (ik .lt. 1) goto 10
      open(unit=28,file='2x2.dat')
      do 30 i=1,ik
      write(iot,40)i
   40 format(/10x,'Enter table',1x,i3)
      read(iin,*)itab(i,1),itab(i,2),itab(i,3),itab(i,4)
      write(iot,*)(itab(i,j),j=1,4)
      write(28,*)(itab(i,j),j=1,4)
      NIK(1,I)=ITAB(I,1)+ITAB(I,2)
      NIK(2,I)=ITAB(I,3)+ITAB(I,4)
      NJK(1,I)=ITAB(I,1)+ITAB(I,3)
      NJK(2,I)=ITAB(I,2)+ITAB(I,4)
      NTOT(I)=NIK(1,I)+NIK(2,I)
   30 continue
      GOTO 100
      END IF

cccccccccccccccccccccccccccccccccc
      OPEN(UNIT=27,FILE=FNAME)
C     OPEN(UNIT=27,FILE='peni.dat')
      DO 70 I=1,MXS
      READ(27,*,END=100)(ITAB(I,J),J=1,4)
      IK=IK+1
      WRITE(IOT,*)(ITAB(I,J),J=1,4)
      NIK(1,I)=ITAB(I,1)+ITAB(I,2)
      NIK(2,I)=ITAB(I,3)+ITAB(I,4)
      NJK(1,I)=ITAB(I,1)+ITAB(I,3)
      NJK(2,I)=ITAB(I,2)+ITAB(I,4)
      NTOT(I)=NIK(1,I)+NIK(2,I)
   70 CONTINUE
  100 PRINT *,'NO. STRATA ',IK
      WRITE(*,80)
   80 FORMAT(/,'ENTER CODE FOR ANALYSIS : ',
     1 /,/,' 1: P-VALUE AND CONFIDENCE INTERVAL',
     2 /,' 2: COVERAGE PROBABILITY FOR CIS.',/)
      READ(*,*)NCODE
      IF (NCODE .EQ. 1) GO TO 110
      IERR = 0
      jci=0
      call cnv2x2(ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh,
     1 jci,odr)
      print*
      print*,'TOTAL NO. OF RANDOM TABLES =',iotot
C
C     COMPUTE CIS FOR EACH RANDOM TABLE
C
      OPEN(UNIT=45,FILE='OO.CI')
      OPEN(UNIT=46,FILE='OM.CI')
      OPEN(UNIT=47,FILE='TO.CI')
      OPEN(UNIT=48,FILE='TM.CI')
      DO 5000 IAITN=1,IOTOT
C     print*,'data'
C     PRINT*,'NO. OF RANDOM TABLE =',IAITN
      DO 5010 I=1,IK
      ITAB(I,1)=INUM(IAITN,I)+INFHYL(I)
      ITAB(I,2)=NIK(1,I)-ITAB(I,1)
      ITAB(I,3)=NJK(1,I)-ITAB(I,1)
      ITAB(I,4)=NTOT(I)-(ITAB(I,1)+ITAB(I,2)+ITAB(I,3))
C     print*,itab(i,1),itab(i,2),itab(i,3),itab(i,4)
 5010 CONTINUE
      ff=0.d0
  110 IERR = 0
C     CALL GETTIM(JH1,JM1,JS1,JSS1)
      jci=0
      call cnv2x2(ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh,
     1 jci,odr)
      IF (IERR .GT. 0) THEN
      CALL ERROR(IERR,MXS,mxss,MXZ,MXD)
      GOTO 170
      END IF
C     CALL GETTIM(JH2,JM2,JS2,JSS2)
C     ITIME = 60*60*(JH2-JH1) + 60*(JM2-JM1) + JS2-JS1
c     ATIME = FLOAT(ITIME) + FLOAT(JSS2-JSS1)/100.0
  170 CONTINUE

C     CALCULATES OBSERVED POSITION IN SAMPLE SPACE
C     J   - POSITION
C     SCD - SIZE COND. SAMPLE SPACE
C     ADDED BY DONGUK KIM
      ISUMA=0
C
      DO 180 I=1,IK
      ISUMA = ISUMA + ITAB(I,1)
  180 CONTINUE
      J = ISUMA - K1 + 1
      SCD = K2 - K1 + 1
      IF (NCODE .EQ. 1) THEN
      print*
      PRINT*,'--------------------------------------------------'
      print*,'DISTRIBUTION in T : ',k1,' <= T <=',k2,scd,' values'
      print*,'OBSERVED T is in ',j,' th position among ',scd
      print*,'OBSERVED PRIMARY TEST STATISTIC =',isuma
      print*,'OBSERVED SECONDARY TEST STATISTIC =',SNGL(CHIOBS)
      PRINT*
      print*
      PRINT*,'--------------------------------------------------'
      WRITE(*,906)PEXIMP
      WRITE(*,907)PEX
      PRINT*
      PRINT*,'PROB. OF OBSERVED TABLES =',SNGL(POBSH1/DENO)
      PRINT*,'--------------------------------------------------'
      PRINT*
  906 FORMAT('THE MODIFIED EXACT P-VALUE =',3X,F12.6)
  907 FORMAT('THE ORDINARY EXACT P-VALUE =',3X,F12.6)
      ENDIF
      DO 210 I=K1,K2
      C(I-K1+1)=DS(IPAR,I-k1)
  210 CONTINUE
      IF (NCODE .EQ. 1) THEN
      open(unit=29,file='dist.fx5')
      DO 211 I=K1,K2
      write(29,*)i-k1+1,C(i-k1+1),DEXP(C(i-k1+1)),I
  211 CONTINUE
      ENDIF
C
C     IPOS=1 IF OBSERVED IS ON LOWER BOUNDARY, 2 ON UPPER 0 OW
C
      IPOS=0
      IF(J .EQ. 1) IPOS=1
      IF(J .EQ. SCD) IPOS=2
C     PRINT *,'IPOS',IPOS

C
C     CALCULATE STARTING VALUES
C
C     CALL SATO(ITAB,IK,LL,UL,MH,KA,IA,RLL,RUL,IPOS)
      CALL SATO(ITAB,IK,LL,UL,MH,KA,IA,RLL,RUL,IPOS,VRBG)
      R1(3) = UL
      R2(3) = LL
      R1(4) = RUL
      R2(4) = RLL
C     P-VALUES
      IF(IPOS .GE. 1)GOTO 220
      FF=0.5
      MIDP=FLOWER(0.D00)+K
      PVAL2=FUPPER(0.D00)+K
      IF(MIDP .GT. PVAL2)MIDP=PVAL2
      MIDP=MIDP*2
      FF=1.0
      MAXP=FLOWER(0.D00)+K
      PVAL2=FUPPER(0.D00)+K
      IF(MAXP .GT. PVAL2)MAXP=PVAL2
      MAXPE=MAXP
C     PRINT*,'ONE SIDED P_EXACT ',MAXPE
C
      MAXP=MAXP*2
  220 FF=1.0
      IF(IPOS .EQ. 1)THEN
      MAXP=2*(FLOWBO(0.D00)+K)
      FF=0.5
      MIDP=2*(FLOWBO(0.D00)+K)
      ENDIF
      FF=0.5
      IF(IPOS .EQ. 2)THEN
      MIDP=2*(FUPPBO(0.D00)+K)
      FF=1.0
      MAXP=2*(FUPPBO(0.D00)+K)
      ENDIF
      FF=0



      DO 1000 JJ=1,2
      FF=FF+0.5
C     PRINT *,'FF',FF
      IF(IPOS .EQ. 1) GOTO 300
      IF(IPOS .EQ. 2) GOTO 310
C     M.U.E
      IF(FF .EQ. 0.5) THEN
      K = 4.999999999999999D-001
      X0 = DLOG(MH)
C     PRINT *,'MH',MH
      CALL brent(X0,EPS,IMAX,ROOTRF,FLOWER,NRF)
      MUE = ROOTRF
C     PRINT *,'M.U.E',ROOTRF
      K = KA(IA,2)
C     PRINT *,'K',K
      END IF
      X0 = DLOG(UL)
      CALL brent(X0,EPS,IMAX,ROOTRF,FLOWER,NRF)
      R1(JJ)=ROOTRF
      X0 = DLOG(LL)
      CALL brent(X0,EPS,IMAX,ROOTRF,FUPPER,NRF)
      R2(JJ)=ROOTRF
      GOTO 340
  300 X0 = DLOG(UL)
      CALL brent(X0,EPS,IMAX,ROOTRF,FLOWBO,NRF)
      R1(JJ)=ROOTRF
      GOTO 341
  310 X0 = DLOG(LL)
      CALL brent(X0,EPS,IMAX,ROOTRF,FUPPBO,NRF)
      R1(JJ)=ROOTRF
      GOTO 342



  340 IF(FF .EQ. 0.5) GOTO 1000
      IF (NCODE .EQ. 1) THEN
      WRITE(*,999)IA
      PRINT *,' POINT ESTIMATES'
      PRINT *,' '
      WRITE(*,979)MH,DEXP(MUE)
      PRINT *,' '
      WRITE(*,993)MAXPE
      WRITE(*,994)POBSH
      PRINT *,' '
      PRINT *,' INTERVAL ESTIMATES          LOWER            UPPER',
     + '     2*ONESIDED P'
      PRINT *,' '
      WRITE(*,991)DEXP(R2(2)),DEXP(R1(2)),MAXP
      WRITE(*,992)DEXP(R2(1)),DEXP(R1(1)),MIDP
      WRITE(*,980)R2(3),R1(3)
      WRITE(*,981)R2(4),R1(4)
      ENDIF
      GOTO 343
  341 IF (FF .EQ. 0.5)GOTO 1000
      IF (NCODE .EQ. 1) THEN
      WRITE(*,999)IA
      PRINT *,' LOWER BOUNDARY: UPPER LIMITS ONLY'
      PRINT *,' '
      PRINT *,' INTERVAL ESTIMATES          LOWER            UPPER',
     + '     2*ONESIDED P'
      WRITE(*,971)DEXP(R1(2)),MAXP
      WRITE(*,972)DEXP(R1(1)),MIDP
      WRITE(*,973)R1(3)
      ENDIF
      GOTO 343
  342 IF(FF .EQ. 0.5)GOTO 1000
      IF (NCODE .EQ. 1) THEN
      WRITE(*,999)IA
      PRINT *,' UPPER BOUNDARY: LOWER LIMITS ONLY'
      PRINT *,' '
      PRINT *,' INTERVAL ESTIMATES          LOWER            UPPER',
     + '     2*ONESIDED P'
      WRITE(*,974)DEXP(R1(2)),MAXP
      WRITE(*,975)DEXP(R1(1)),MIDP
      WRITE(*,976)R2(3)
      ENDIF
  343 CONTINUE
      IF (NCODE .EQ. 1) THEN
      PRINT*,'--------------------------------------------------'
      ENDIF


  971 FORMAT(' MAX-P EXACT         ',17X,2(5X,F12.6))
  972 FORMAT(' MID-P EXACT         ',17X,2(5X,F12.6))
  973 FORMAT(' MANTEL-HAENSZEL-SATO',17X,5X,F12.6)
  974 FORMAT(' MAX-P EXACT         ',5X,F12.6,17X,F12.6)
  975 FORMAT(' MID-P EXACT         ',5X,F12.6,17X,F12.6)
  976 FORMAT(' MANTEL-HAENSZEL-SATO',5X,F12.6)
  979 FORMAT(' MANTEL-HAENSZEL =',F12.6,
     + ' MEDIAN UNBIASED =',F12.6)
  980 FORMAT(' MANTEL-HAENSZEL-SATO',2(5X,F12.6))
  981 FORMAT(' MANTEL-HAENSZEL-    ',2(5X,F12.6))
  988 FORMAT(/,/,' MODIFIED P-VALUE ',3(5X,F12.6))
  990 FORMAT('MODIFIED EXACT CI USING BRENT1',2(5X,F12.6))
C 990 FORMAT(/,/,' MID P CORRECTED P EXACT ',3(5X,F12.6))
  991 FORMAT(' MAX-P EXACT         ',3(5X,F12.6))
  992 FORMAT(' MID-P EXACT         ',3(5X,F12.6))
  993 FORMAT(' ONE SIDED P EXACT ',5X,F12.6)
  994 FORMAT(' PROB OF OBSERVED TABLES',2X,F12.6)
  995 FORMAT(2(10X,F12.6))
  996 FORMAT(10X,F12.6)
  997 FORMAT(10X,'ENTER FILENAME (k for keyboard entry - c to chan',
     + 'ge alpha-level)',4(/))
  998 FORMAT(/2X,'ELAPSED TIME (SECS) = ',F8.2,' + ',F8.2,' = ',
     + F8.2)
  999 FORMAT(3(/),2X,I2,
     + '% TWO-SIDED EXACT CONFIDENCE',
     + ' LIMITS FOR THE COMMON ODDS RATIO',
     + 2(/))
 1000 CONTINUE


      IF (NCODE .EQ. 1) THEN
      WRITE(*,350)
  350 FORMAT(/,/,'MODIFIED EXACT CONFIDENCE INTERVAL (Y=1,N=0) ?')
      READ(*,*)NCODE1
      IF (NCODE1 .NE. 1) THEN
      PRINT*,'END'
      GO TO 1002
      ENDIF
      END IF
      JCI=1


C     ONE-SIDED MODIFIED P CONFIDENCE INTERVAL.
 2001 CONTINUE
C2001 WRITE(*,2005)
 2005 FORMAT(/,/,'MODIFIED EXACT CONFIDENCE LIMITS FOR ',/,
     1 'THE COMMON ODDS RATIO USING ITERA (Y=1,N=0) ?',/,/)
c     READ(*,*)JSCI1
      JSCI1=1
C     PRINT*,'JSCI1=1'
      IF (JSCI1 .EQ. 0) GO TO 3001
C     INITIAL VALUE FOR ITERA1 IS THE LIMITS FROM ORDINARY EXACT CI
      ALL=DEXP(R2(2))
      AUL=DEXP(R1(2))*1.1D0
C     AUL=DEXP(R1(2))
C     print*,'INITIAL VALUE USING ORDINARY EXACT CI = ',all,aul
C     COMPUTE LOWER LIMIT
      ist=1
      JCIO=2
      IF (J .EQ. SCD) ALL=DEXP(R1(2))
      IF (J .EQ. 1) THEN
      ELL1=0.D0
      P_LO1=1.D0
C     PRINT*,'LOWER LIMIT =',ELL1
C     PRINT*
      GO TO 2006
      END IF
C     PRINT*,'INITIAL VALUE FOR THE LOWER LIMIT = ',ALL
      CALL ITERA1(ALPHA,ALL,ELL1,ist,JCIO,PALPHA)
      P_LO1=PALPHA
C     print*,'lower limit ell1 from ITERA1 =',ell1
C     PRINT*
C     COMPUTE UPPER LIMIT
c     START=1.D0/AUL
 2006 START=AUL
      ist=2
      JCIO=1
      IF (J .EQ. SCD) THEN
      EUL1=99999.999999
      P_UP1=1.D0
C     PRINT*,'UPPER LIMIT =',EUL1
      GO TO 2007
      ENDIF
C     PRINT*,'INITIAL VALUE FOR THE UPPER LIMIT = ',AUL
C     print*,'start=',start
      CALL ITERA1(ALPHA,START,EUL1,ist,JCIO,PALPHA)
      P_UP1=PALPHA
C     print*,'upper limit eul1 from ITERA1=',eul1
 2007 CONTINUE
      IF (NCODE .EQ. 2) GO TO 3001
      PRINT*
      PRINT*,'--------------------------------------------------'
      WRITE(*,2010) ELL1,EUL1
      PRINT*,'P-VALUE FOR THE LIMIT (low, up) =',P_LO1,P_UP1
      PRINT*,'--------------------------------------------------'
      PRINT*
 2010 FORMAT('ONE-SIDED MODIFIED EXACT CI ',2(5X,F12.6))

C     TWO-SIDED ORDINARY P CONFIDENCE INTERVAL.
C
C     The P-value is the sum of the either tail.
 3001 CONTINUE
C3001 WRITE(*,3600)
 3600 FORMAT(/,/,'TWO-SIDED ORDINARY EXACT CONFIDENCE LIMITS ',
     1 'FOR THE COMMON ODDS RATIO (Y=1,N=0) ?',/)
c     READ(*,*)JTSCI
      IOOTO=1
      JTSCI=1
C     PRINT*,'JTSCI=1'
      IF (JTSCI .EQ. 0) GO TO 1001
C     STARTING VALUES ARE LIMITS FOR ORDINARY EXACT CI.
      ALL=DEXP(R2(2))
      AUL=DEXP(R1(2))*1.1D0
C     AUL=DEXP(R1(2))
C     print*,'INITIAL VALUE USING ORDINARY EXACT CI = ',all,aul
C     COMPUTE LOWER LIMIT
      ist=1
      IF (J .EQ. 1) THEN
      ELL3=0.D0
      P_LO3=1.D0
C     PRINT*,'LOWER LIMIT =',ELL3
C     PRINT*
      GO TO 3006
      ENDIF
      IF (J .EQ. SCD) ALL=DEXP(R1(2))
C     PRINT*,'INITIAL VALUE FOR THE LOWER LIMIT = ',ALL
      CALL ITERA(ALPHA,ALL,ELL3,ist,PALPHA,IOOTO)
      P_LO3=PALPHA
C     print*,'lower limit ell3 from ITERA =',ell3
C     PRINT*
C     COMPUTE UPPER LIMIT
c     START=1.D0/AUL
 3006 START=AUL
      ist=2
      IF (J .EQ. SCD) THEN
      EUL3=99999.999999
      P_UP3=1.D0
C     PRINT*,'UPPER LIMIT =',EUL3
      GO TO 3007
      END IF
C     print*,'start=',start
C     PRINT*,'INITIAL VALUE FOR THE UPPER LIMIT = ',AUL
      CALL ITERA(ALPHA,START,EUL3,ist,PALPHA,IOOTO)
      P_UP3=PALPHA
C     print*,'upper limit eul3 from ITERA =',eul3
 3007 CONTINUE
      IF (NCODE .EQ. 2) GO TO 1001
      PRINT*
      PRINT*,'--------------------------------------------------'
      WRITE(*,3989) ELL3,EUL3
      PRINT*,'P-VALUE FOR THE LIMIT (low, up) =',P_LO3,P_UP3
      PRINT*,'--------------------------------------------------'
      PRINT*
 3989 FORMAT('TWO-SIDED ORDINARY EXACT CI ',2(5X,F12.6))


C     TWO-SIDED MODIFIED P CONFIDENCE INTERVAL.
C
C     The P-value is the sum of the either tail.
 1001 CONTINUE
C1001 WRITE(*,600)
  600 FORMAT(/,/,'TWO-SIDED MODIFIED EXACT CONFIDENCE LIMITS ',
     1 'FOR THE COMMON ODDS RATIO (Y=1,N=0) ?',/)
c     READ(*,*)JSCI
      IOOTO=2
      JSCI=1
C     PRINT*,'JSCI=1'
      IF (JSCI .EQ. 0) GO TO 1002
C     STARTING VALUES ARE LIMITS FOR ORDINARY EXACT CI.
      ALL=DEXP(R2(2))
      AUL=DEXP(R1(2))*1.1D0
C     AUL=DEXP(R1(2))
C     print*,'INITIAL VALUE USING ORDINARY EXACT CI = ',all,aul
C     COMPUTE LOWER LIMIT
      ist=1
      IF (J .EQ. 1) THEN
      ELL2=0.D0
      P_LO2=1.D0
C     PRINT*,'LOWER LIMIT =',ELL2
C     PRINT*
      GO TO 1006
      ENDIF
      IF (J .EQ. SCD) ALL=DEXP(R1(2))
C     PRINT*,'INITIAL VALUE FOR THE LOWER LIMIT = ',ALL
      CALL ITERA(ALPHA,ALL,ELL2,ist,PALPHA,IOOTO)
      P_LO2=PALPHA
C     print*,'lower limit ell2 from ITERA =',ell2
C     PRINT*
C     COMPUTE UPPER LIMIT
c     START=1.D0/AUL
 1006 START=AUL
      ist=2
      IF (J .EQ. SCD) THEN
      EUL2=99999.999999
      P_UP2=1.D0
C     PRINT*,'UPPER LIMIT =',EUL2
      GO TO 1007
      ENDIF
C     print*,'start=',start
C     PRINT*,'INITIAL VALUE FOR THE UPPER LIMIT = ',AUL
      CALL ITERA(ALPHA,START,EUL2,ist,PALPHA,IOOTO)
      P_UP2=PALPHA
C     print*,'eul = ',eul
c     EUL=1.D0/EUL
C     print*,'upper limit eul2 from ITERA =',eul2
 1007 CONTINUE
      IF (NCODE .EQ. 2) GO TO 1008
      PRINT*
      PRINT*,'--------------------------------------------------'
      WRITE(*,989) ELL2,EUL2
      PRINT*,'P-VALUE FOR THE LIMIT (low, up) =',P_LO2,P_UP2
      PRINT*,'--------------------------------------------------'
      PRINT*
      PRINT*,'END'
  989 FORMAT('TWO-SIDED MODIFIED EXACT CI ',2(5X,F12.6))
      IF (NCODE .EQ. 1) GO TO 1002

C*********************************************************************
 1008 CONTINUE
      IF (J .EQ. 1 .OR. J .EQ. SCD) THEN
      IF (J .EQ. 1) THEN
      OOCI(IAITN,1)=0.D0
      OOCI(IAITN,2)=DEXP(R1(2))
      OMCI(IAITN,1)=ELL1
      OMCI(IAITN,2)=EUL1
      TOCI(IAITN,1)=ELL3
      TOCI(IAITN,2)=EUL3
      TMCI(IAITN,1)=ELL2
      TMCI(IAITN,2)=EUL2
      ENDIF
      IF (J .EQ. SCD) THEN
      OOCI(IAITN,1)=DEXP(R1(2))
      OOCI(IAITN,2)=99999.99999
      OMCI(IAITN,1)=ELL1
      OMCI(IAITN,2)=EUL1
      TOCI(IAITN,1)=ELL3
      TOCI(IAITN,2)=EUL3
      TMCI(IAITN,1)=ELL2
      TMCI(IAITN,2)=EUL2
      END IF
      ELSE
      OOCI(IAITN,1)=DEXP(R2(2))
      OOCI(IAITN,2)=DEXP(R1(2))
      OMCI(IAITN,1)=ELL1
      OMCI(IAITN,2)=EUL1
      TOCI(IAITN,1)=ELL3
      TOCI(IAITN,2)=EUL3
      TMCI(IAITN,1)=ELL2
      TMCI(IAITN,2)=EUL2
      END IF
      WRITE(45,5100)IAITN,OOCI(IAITN,1),OOCI(IAITN,2)
      WRITE(46,5100)IAITN,OMCI(IAITN,1),OMCI(IAITN,2)
      WRITE(47,5100)IAITN,TOCI(IAITN,1),TOCI(IAITN,2)
      WRITE(48,5100)IAITN,TMCI(IAITN,1),TMCI(IAITN,2)
 5000 CONTINUE


c
c     COVERAGE PROBABILITY
C
      DALL=-5.51D0
      IST=1
      DO 5200 IAIN=1,1100
      DALL=DALL+0.01D0
      ALL=DEXP(DALL)
      CALL ITERA10(ALPHA,ALL,ELL2,IST,PALPHA,HYPD10)
      POOCI=0.D0
      POMCI=0.D0
      PTOCI=0.D0
      PTMCI=0.D0
      DO 5210 IAITN=1,IOTOT
      IF (ALL .GE. OOCI(IAITN,1) .AND. ALL .LE. OOCI(IAITN,2)) THEN
      POOCI=POOCI+HYPD10(IAITN,1)
      END IF
      IF (ALL .GE. OMCI(IAITN,1) .AND. ALL .LE. OMCI(IAITN,2)) THEN
      POMCI=POMCI+HYPD10(IAITN,1)
      END IF
      IF (ALL .GE. TOCI(IAITN,1) .AND. ALL .LE. TOCI(IAITN,2)) THEN
      PTOCI=PTOCI+HYPD10(IAITN,1)
      ENDIF
      IF (ALL .GE. TMCI(IAITN,1) .AND. ALL .LE. TMCI(IAITN,2)) THEN
      PTMCI=PTMCI+HYPD10(IAITN,1)
      ENDIF
 5210 CONTINUE
      COVER(IAIN,1)=DALL
      COVER(IAIN,2)=POOCI
      COVER(IAIN,3)=POMCI
      COVER(IAIN,4)=PTOCI
      COVER(IAIN,5)=PTMCI
 5200 CONTINUE
      OPEN(UNIT=50,FILE='COVER.P')
      DO 5220 IAIN=1,1100
      WRITE(50,5300)COVER(IAIN,1),COVER(IAIN,2),COVER(IAIN,3),
     1 COVER(IAIN,4),COVER(IAIN,5)
 5220 CONTINUE
      PRINT*,'END'
 5100 FORMAT(I5,2F12.6)
 5300 FORMAT(5F12.4)
 1002 END


      subroutine error(ierr,mxs,mxss,mxz,mxd)
      iot = 6
      if (ierr .eq. 1) then
      write(iot,10)mxs
   10 format(/10x,'Error : Maximum no. of strata = ',i4)
      return
      endif
      if (ierr .eq. 2) then
      write(iot,20)mxz
   20 format(/10x,'Insufficient memory : Increase size of',/,
     + 'array HYP to be more than ',i7)
      return
      endif
      if (ierr .eq. 3) then
      write(iot,30)mxd
   30 format(/10x,'Insufficient memory : Increase size of',/,
     + 'array DS to be more than ',i7)
      return
      endif
      if (ierr .eq. 4) then
      write(iot,40)mxss
   40 format(/10x,'Error : Maximum stratum size = ',i9)
      return
      endif
      return
      end

C**********
      SUBROUTINE CNV2X2(IK,MXS,MXZ,MXD,LGE,ITAB,HYP,DS,II,K1,K2,IERR,
     1 pobsh,JCI,ODR)
C
C     CONVOLVES HYPERGEOMETRIC DISTRIBUTIONS
C     GENERATED BY SEVERAL 2X2 TABLES
C
      INTEGER ITAB(MXS,4)
      DOUBLE PRECISION HYP(0:MXZ),DS(0:1,0:MXD),SUMLG
      DOUBLE PRECISION DD1,DD2,ONE,ZERO,HYMAX,DSMX,LGE,EL
      DOUBLE PRECISION ZLOG,ZEXP,X
      DOUBLE PRECISION hypsum(1000),hypobs(1000),pobsh,hypd(1000,0:2000)
      DOUBLE PRECISION DENO1,POBSH1,PEXIMP,PEX,ODR,PSI
      integer infhyl(1000),infhyu(1000)
      COMMON/CI2/hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX
      DATA ONE,ZERO /1.0D+00,0.0D+00/
      ZLOG(X) = DLOG(X)
      ZEXP(X) = DEXP(X)
C
C     CHECK INPUT PARAMETERS
C
      IF (IK .GT. MXS) IERR=1
      K1 = 0
      DO 1 I=1,IK
      IMM = ITAB(I,1) + ITAB(I,2)
      INN = ITAB(I,3) + ITAB(I,4)
      ITT = ITAB(I,1) + ITAB(I,3)
C
C     LOWER AND UPPER LIMITS FOR STRATUM DISTRIBUTION
C
      IF (ITT .GT. INN) THEN
      IL1 = ITT-INN
      ELSE
      IL1 = 0
      END IF
      IF (ITT .LT. IMM) THEN
      IL2 = ITT
      ELSE
      IL2 = IMM
      END IF
      ITT = IL2 - IL1
      IF (ITT .GT. MXZ) IERR = 2
      K1 = K1 + IL1
    1 CONTINUE

      IF (IERR .GT. 0) RETURN
C
      DD1 = 10*ONE
      EL = LGE*ZLOG(DD1)
C
C     INITIALISE AND SET LOG-SCALE INDICATOR
C
      II = 0
      JJ = 1
      IR = 0
      DS(0,0) = ONE/DEXP(EL)
      DSMX = ZERO - EL
      ILS = 0
C
C     FOR STRATA=1,...,IK, COMPUTE HYPERGEOMETRIC DISTRIBUTION
C     AND PERFORM CONVOLUTION IN A RECURSIVE FASHION
C
      DO 13 I=1,IK
      IMM = ITAB(I,1) + ITAB(I,2)
      INN = ITAB(I,3) + ITAB(I,4)
      ITT = ITAB(I,1) + ITAB(I,3)
C
C     LOWER AND UPPER LIMITS FOR CONVOLUTION
C
      IF (ITT .GT. INN) THEN
      IL1 = ITT-INN
      ELSE
      IL1 = 0
      END IF
      IF (ITT .LT. IMM) THEN
      IL2 = ITT
      ELSE
      IL2 = IMM
      END IF
      IL2 = IL2 - IL1
      K2 = IR + IL2
      IF (K2 .GT. MXD) THEN
      IERR = 3
      RETURN
      END IF
C
C     COMPUTE STRATUM DISTRIBUTION ON LOG-SCALE
C
      HYP(0) = ZERO
      DO 2 J=1,IL2
      DD1 = DBLE(FLOAT(IMM-J-IL1+1))*DBLE(FLOAT(ITT-J-IL1+1))
      DD2 = DBLE(FLOAT(J+IL1))*DBLE(FLOAT(INN-ITT+J+IL1))
      HYP(J) = HYP(J-1) + ZLOG(DD1/DD2)
C     print*,'data',j,HYP(J),zexp(HYP(J)),j+IL1
    2 CONTINUE
      IF (ILS .EQ. 1) GOTO 9
C
C     GET MAXIMUM HYPERGEOMETRIC COEFFICIENT ON LOG-SCALE AND
C     CHECK FOR POTENTIAL OVERFLOW IN STRATUM DISTRIBUTION
C
      AM = (1.0 + FLOAT(ITT))/(1.0 + FLOAT(INN+1)/FLOAT(IMM+1))
      IAM = IFIX(AM) - IL1
      IF (HYP(0) .GT. HYP(IL2)) THEN
      DO 3 J=0,IL2
      HYP(J) = HYP(J) - HYP(IL2)
C     print*,'j,hyp(j),zexp(HYP(J))',j,hyp(j),zexp(HYP(J))
    3 CONTINUE
      ENDIF
      HYMAX = HYP(IAM)
      IF (HYMAX .GT. EL) THEN
      ILS = 1
      print*,'ILS=1 in CI'
      GOTO 7
      ENDIF

C
C     CHECK FOR POTENTIAL OVERFLOW IN THE ITH CONVOLUTION
C
      IF (IL2 .LT. IR) THEN
      IXX = IL2 + 1
      ELSE
      IXX = IR + 1
      END IF
C     IXX = (IL2 + 1)*(IR + 1)
      DD1 = DBLE(FLOAT(IXX))
      DD1 = ZLOG(DD1)
      DSMX = DSMX + HYMAX + DD1
C     WRITE(*,*)I,DSMX,HYMAX,DD1
      IF (DSMX .GE. EL - ONE) THEN
      ILS = 1
      print*,'ILS=1 in CI'
      GOTO 7
      END IF
C
C     CONVERT STRATUM DISTRIBUTION TO NATURAL SCALE
C
      hypsum(i)=0.d0
C     PRINT*,'ODR,JCI',ODR,JCI
      DO 4 J=0,IL2
      IF (JCI .EQ. 1) THEN
      HYP(J) = ZEXP(HYP(J))*ODR**(J+IL1)
      GO TO 9999
      END IF
      HYP(J) = ZEXP(HYP(J))
C     ihyp(i,j)=j
 9999 hypd(i,j)=hyp(j)
c     hypd(i,j) is hypergeometric prob dist for each stratum.
      hypsum(i)=hypsum(i)+hyp(j)
C     print*,'HYP(J)',J,HYP(J),hypsum(i)
    4 CONTINUE
      infhyl(i)=il1
      infhyu(i)=il2
c     infhyu(i) is the no. of possible tables for the fixed 2x2 tables.
      ikim=itab(i,1)-IL1
      hypobs(i)=hyp(ikim)
C     HYPOBS(I) IS THE NUMERATOR OF PROB FOR EACH STRATUM.
C     PRINT*,'STRATUM PROB',I,hyp(ikim),hypsum(i),hypobs(i)/HYPSUM(I)
c 

C PERFORM  CONVOLUTION  ON  NATURAL  SCALE 
C

DSMX  = ZERO  - EL 
DO  6 J=0,K2 

IF (J .GT. IL2) THEN
IH1 = J-IL2
ELSE
IH1 = 0
END IF
IF (J .LT. IR) THEN
IH2 = J
ELSE
IH2 = IR
ENDIF
DS(JJ,J) = DS(II,IH1)*HYP(J-IH1)
DO 5 JR=IH1+1,IH2
DD1 = DS(II,JR)*HYP(J-JR)
DS(JJ,J) = DS(JJ,J) + DD1
5 CONTINUE
DD1 = ZLOG(DS(JJ,J))
IF (DD1 .GT. DSMX) DSMX = DD1

6 CONTINUE 
GOTO  12 

7 CONTINUE 
C 

C CONVERT  (I-l)TH  CONVOLVED  DISTRIBUTION  TO  LOG  SCALE 
C 

DO  8 KK=0,IR 

DS(II,KK) = EL + ZLOG(DS(II,KK))

8 CONTINUE 
C 

C PERFORM  CONVOLUTION  ON  LOGARITHMIC  SCALE 
C 

9 CONTINUE 



C 

if  (ils  .eq.  1)  then 
hypsum(i)=0.d0
C PRINT*,'ODR,JCI',ODR,JCI
DO 4444 J=0,IL2
IF (JCI .EQ. 1) THEN
HYP(J) = ZEXP(HYP(J))*ODR**(J+IL1)
GO TO 9998
ENDIF
HYP(J) = ZEXP(HYP(J))
C ihyp(i,j)=j
9998 hypd(i,j)=hyp(j)
c hypd(i,j) is hypergeometric prob dist for each stratum.
hypsum(i)=hypsum(i)+hyp(j)
C print*,'HYP(J)',J,HYP(J),hypsum(i)

4444  CONTINUE 

infhyl(i)=il1
infhyu(i)=il2
c infhyu(i) is the no. of possible tables for the fixed 2x2 tables.
ikim=itab(i,1)-IL1
hypobs(i)=hyp(ikim)
C HYPOBS(I) IS THE NUMERATOR OF PROB FOR EACH STRATUM.
endif 


DO  11  J=0,K2 

IF (J .GT. IL2) THEN
IH1 = J-IL2
ELSE
IH1 = 0
ENDIF
IF (J .LT. IR) THEN
IH2 = J
ELSE
IH2 = IR
ENDIF
DS(JJ,J) = DS(II,IH1) + HYP(J-IH1)
DO 10 JR=IH1+1,IH2
DD1 = DS(II,JR) + HYP(J-JR)
DD2 = DS(JJ,J)
DS(JJ,J) = SUMLG(DD1,DD2)

10  CONTINUE 

11  CONTINUE 
C 

C RESET  FOR  NEXT  STEP 

C 

12  II  = JJ 

JJ  = 1 - II 
IR  = K2 
C 

C DELETE  THE  NEXT  TWO  STATEMENTS  FROM 

C THE  PUBLISHED  ALGORITHM 

C 
C 


write(*,95)i,ils
95 format(/10x,'stratum no.',i3,4x,'scale =',i2)

C 

C 

13  CONTINUE 

C ADDED  BY  DONGUK  KIM 
POBSH=1.D0
DO 20 I=1,IK
20 POBSH=POBSH*HYPOBS(I)
POBSH1=POBSH
C POBSH IS OBSERVED VALUE AND PROB IS POBSH/DENO1
DENO1=1.D0
DO 21 I=1,IK
21 DENO1=DENO1*HYPSUM(I)
POBSH=POBSH/DENO1
C PRINT*,'PROB OF OBSERVED TABLE= ',POBSH
C PRINT*,'OBSERVED VALUE FOR PROB= ',POBSH1

C 

C NORMALISE  FINAL  DISTRIBUTION 
C 

IF  (ILS  .NE.  1)  THEN 
DO 14 I=0,K2
DS(II,I) = ZLOG(DS(II,I))
14 CONTINUE
ENDIF
DSMX = DS(II,0)
DO 15 I=1,K2
DD1 = DS(II,I)
DD2 = SUMLG(DSMX,DD1)
DSMX = DD2
15 CONTINUE
DO 16 I=0,K2

DS(II,I)  = DS(II,I)  - DSMX 

16  CONTINUE 

K2  = K1  + K2 


c Added  by  DONGUK  KIM 
C 

C CALCULATES  OBSERVED  POSITION  IN  SAMPLE  SPACE 
C J - POSITION 

C SCD  - SIZE  COND.  SAMPLE  SPACE 
ISUMA=0
DO 180 I=1,IK

ISUMA  = ISUMA  + ITAB(I,1) 

180  CONTINUE 

J = ISUMA  - K1  + 1 

SCD  = K2  - K1  + 1 

IF  (JCI  .EQ.  1)  GO  TO  999 

JCIO=0
C IF (JCI .EQ. 1) GO TO 999
CALL IMPROV(ik,itab,hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX,JCIO,PSI)

999  RETURN 
END 
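
The natural-scale convolution loop in the routine above combines the running distribution of the stratum sum with the distribution of the next stratum. The free-form sketch below shows the same index pattern on two small made-up distributions; the program name `conv_demo` and the arrays `p`, `q`, `r` are illustrative assumptions and are not part of the listing.

```fortran
! Sketch only: convolve two discrete distributions, mirroring the
! natural-scale convolution step. All names and values are hypothetical.
program conv_demo
  implicit none
  integer, parameter :: np = 2, nq = 3
  double precision :: p(0:np), q(0:nq), r(0:np+nq)
  integer :: j, jr

  p = (/ 0.25d0, 0.50d0, 0.25d0 /)
  q = (/ 0.125d0, 0.375d0, 0.375d0, 0.125d0 /)

  do j = 0, np + nq
     r(j) = 0.0d0
     ! jr runs over the admissible values of the first component,
     ! playing the role of the IH1..IH2 limits in the listing
     do jr = max(0, j - nq), min(j, np)
        r(j) = r(j) + p(jr) * q(j - jr)
     end do
  end do

  print '(6f10.6)', r
  print '(a,f10.6)', 'sum = ', sum(r)
end program conv_demo
```

Because both inputs sum to one, the printed sum should equal one up to rounding, which is a convenient check on the loop limits.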


C ADDED  BY  DONGUK  KIM  DEC.l,  1992 

C ENUMERATE  ALL  POSSIBLE  TABLES  WITH  GIVEN  2x2  MARGINS,  AND 
C COMPUTE  IMPROVED  EXACT  AND  EXACT  UPPER  AND  LOWER  TAIL  PROBABILITY. 

SUBROUTINE IMPROV(ik,itab,hypd,infhyl,infhyu,POBSH1,PEXIMP,
1 PEX,JCIO,PSI)

PARAMETER (MAXT=270000)

DOUBLE PRECISION HYPD(1000,0:2000),HYPSUM(1000),HYPD1(270000,1)
DOUBLE PRECISION DENO,TS,POBSH1,PTOBS1,PTOBS2,PEXIMP,PEX
DOUBLE PRECISION PEXIML,PEXIMU,PEXL,PEXU,HYPD2(270000,20)
DOUBLE PRECISION PEXP1,PEXP2,X,PDN1,PDN2
DOUBLE PRECISION PVDIS(270000,5),PDT(500,3),PSI
DOUBLE PRECISION CHI(270000),CHI1(270000),CMH,CHIOBS,G2(270000)
DOUBLE PRECISION G,GOBS,CUP,FIT(1000,2,2)

C PDT(500,3): Pr(P-value<=x) for two P-values.

INTEGER ITAB(1000,4),INFHYL(1000),INFHYU(1000)
INTEGER INUM(270000,20),INUM1(270000,1),IOTOT
INTEGER NIK(2,1000),NJK(2,1000),NTOT(1000),MATRIX(1000,2,2)


LOGICAL  ISEA 




COMMON /DKIM/ DENO,ITOT,ISUML,INUM,HYPD2,INUM1,HYPD1
c COMMON /DKIM1/ POBSH1,ISUMTS,IK
COMMON /DKIM1/ ISUMTS
COMMON /CH/ NIK,NJK,NTOT
COMMON /ART/ IOTOT
COMMON /CHI/ CHI,CHIOBS

C HYPD1 HAS PROB HAVING T .GE. T_OBS.
C INUM1 HAS VALUE IN EACH STRATUM HAVING T .GE. T_OBS.
C THE MAXIMUM NO OF TABLES FOR T .GE. T_OBS IS ALLOWED TO BE 5500.
C IF IT IS GREATER THAN 5500, MAXT IN HYPD1(MAXT,1000) AND
C INUM1(MAXT,1000) SHOULD BE INCREASED.

DENO=1.D0
ISUML=0
ISUMU=0

DO 100 I=1,IK
HYPSUM(I)=0.D0
IJ=INFHYU(I)
C print*,'I,IL1,IL2,HYPSUM(I)',I,INFHYL(I),INFHYU(I),HYPSUM(I)
C IJ IS IL2 FOR EACH STRATUM
DO 110 J=0,IJ
110 HYPSUM(I)=HYPSUM(I)+HYPD(I,J)
ISUML=ISUML+INFHYL(I)
ISUMU=ISUMU+INFHYU(I)
DENO=DENO*HYPSUM(I)
C DENO IS DENOMINATOR OF THE HYPERGEO. PROB. DIST.
100 CONTINUE

C PRINT*,'DENO=',DENO

ISUMA=0
DO 120 I=1,IK
ISUMA=ISUMA+ITAB(I,1)
120 CONTINUE

ISUMTS=ISUMA-ISUML

C PRINT*,'ISUMA,ISUMTS',ISUMA,ISUMTS
C ISUMTS IS OBSERVED TEST STATISTICS WHICH WILL BE USED.

ITOT=1
DO 130 I=1,IK
130 ITOT=ITOT*(INFHYU(I)+1)

IOTOT=ITOT

C PRINT*,'NO. OF POSSIBLE TABLES = ',ITOT

C ITOT  IS  TOTAL  NUMBER  OF  ENUMERATION  FOR  THE  TABLES. 



IF  (ISEA)  GO  TO  315 
C 

C SET  ISEA  FOR  THE  SUBSEQUENT  CALLS 
C 

ISEA=.TRUE. 

C TO  MAKE  INUM 

ICOUNT=0
NUM=1

DO 210 K1=1,IK-1
DO 220 K2=K1+1,IK
220 NUM=NUM*(INFHYU(K2)+1)

C NUM IS NUMBER OF REPLICATES.

K=K1
IN=INFHYU(K)

99 DO 230 K3=1,NUM
ICOUNT=ICOUNT+1
INUM(ICOUNT,K)=IN
230 CONTINUE

IF (IN .GT. 0) THEN
IN=IN-1
GO TO 99
END IF

IF (ICOUNT .LT. ITOT) THEN
IN=INFHYU(K)
GO TO 99
ELSE
ICOUNT=0
NUM=1
ENDIF

210 CONTINUE


C THE  LAST  STRATUM  (i.e.,  LAST  COLUMN  IN  ARRAY) 
ICOUNT=0
K=IK
IN=INFHYU(K)

250 ICOUNT=ICOUNT+1
INUM(ICOUNT,K)=IN

IF  (IN  .GT.  0)  THEN 
IN=IN-1 
GO  TO  250 
END  IF 

IF  (ICOUNT  .LT.  ITOT)  THEN 
IN=INFHYU(K) 

GO  TO  250 
END  IF 

C PRINT*,'ICOUNT=',ICOUNT
C PRINT*,'ITOT=',ITOT

DO 300 I=1,ITOT
ITS=0 

DO  310  J=1,IK 
310  ITS=ITS+INUM(I, J) 

INUM(I,IK+1)=ITS 
300  CONTINUE 
315  CONTINUE 



c PRINT*, 'DISPLAY  ALL  POSSIBLE  TABLES  (Y=1,N=0)  ?' 

C READ(*,*)IDIS 

IDIS=0 

IF (IDIS .EQ. 1) THEN
PRINT*,'INPUT NO. OF INCREMENT :'
C READ(*,*)INCR
INCR=1
PRINT*,'ENUMERATION'
DO 320 I=1,ITOT,INCR
PRINT*,I,(INUM(I,J)+INFHYL(J),J=1,IK),INUM(I,IK+1)+ISUML
320  CONTINUE 

END  IF 

C PRINT* 

C COMPUTATION  OF  PROB  OF  ALL  RANDOM  TABLES 
DO  350  K=1,IK 
DO 355 I=1,ITOT
IA=INUM(I,K) 

HYPD2(I,K)=HYPD(K,IA) 

355  CONTINUE 
350  CONTINUE 

DO 360 I=1,ITOT
TS=1.D0
DO 365 J=1,IK
365 TS=TS*HYPD2(I,J)
HYPD2(I,IK+1)=TS
360 CONTINUE

C PRINT PROB OF ALL RANDOM TABLE
TS=0.D0
DO 367 I=1,ITOT
TS=TS+HYPD2(I,IK+1)
367 CONTINUE

C PRINT*,'SUM OF VALUE, DENO, PROB= ',TS,DENO,TS/DENO

C PRINT* 

C PRINT*, 'PROB  OF  ALL  RANDOM  TABLES  (Y=1,N=0)  ?' 

C READ(*,*)IDSA 

IDSA=0 

IF  (IDSA  .EQ.  1)  THEN 

PRINT*, 'ENUMERATION  OF  PROB  OF  ALL  RANDOM  TABLES' 

DO 370 I=1,ITOT,INCR
PRINT*,I,(sngl(HYPD2(I,J)),J=1,IK),sngl(HYPD2(I,IK+1)/DENO)

370  CONTINUE 

ENDIF 

C print* 



c 

C GENERATE  ALL  POSSIBLE  RANDOM  TABLES 
C 



C WRITE(*, 70010) 

C70010 FORMAT(/,'PRINT X^2 AND G^2 FOR ALL RANDOM TABLES ? (Y=1,N=0)')
C READ(*,*)IX2G2 

IX2G2=0 

C PRINT*, 'IX2G2=0' 
NROW=2
NCOL=2
NSTM=IK
CUP=0.D0


C IPF  IS  CALLED  JUST  ONE  TIME  WITHIN  FIXED  OR. 

IF  (JCIO  .NE.  0)  THEN 
DO  375  K=1,IK 
MATRIX(K,1,1)=ITAB(K,1)
MATRIX(K,1,2)=ITAB(K,2)
MATRIX(K,2,1)=ITAB(K,3)
MATRIX(K,2,2)=ITAB(K,4)

375  CONTINUE 

CALL  IPF(PSI, IK, MATRIX, FIT) 

END  IF 

DO 384 I=1,ITOT
DO 386 K=1,IK
MATRIX(K,1,1)=INUM(I,K)+INFHYL(K)
MATRIX(K,1,2)=NIK(1,K)-MATRIX(K,1,1)
MATRIX(K,2,1)=NJK(1,K)-MATRIX(K,1,1)
MATRIX(K,2,2)=NTOT(K)-(MATRIX(K,1,1)+MATRIX(K,1,2)+MATRIX(K,2,1))
IF (IX2G2 .EQ. 1) THEN
WRITE(*,70000)I,K,MATRIX(K,1,1),MATRIX(K,1,2),MATRIX(K,2,1),
1 MATRIX(K,2,2)
C WRITE(*,70001)K,NIK(1,K),NIK(2,K),NJK(1,K),NJK(2,K),NTOT(K)
70000 FORMAT(2I5,' : ',4I10)
70001 FORMAT('TOTAL',6I10)

END  IF 

386  CONTINUE 

C IF (JCIO .NE. 0) CALL IPF(PSI,IK,MATRIX,FIT)

CALL CMHNN1(NROW,NCOL,NSTM,NIK,NJK,NTOT,MATRIX,CMH,G,
1 JCIO,FIT)

CHI(I)=CMH
G2(I)=G
CUP=CUP+HYPD2(I,IK+1)/DENO
IF (IX2G2 .EQ. 1) THEN
WRITE(*,70002)I,HYPD2(I,IK+1)/DENO,CHI(I),G2(I),CUP
70002 FORMAT('NO., Pr(T), X^2, G^2 = ',I5,4F14.7,/)
END IF

384  CONTINUE 

C COMPUTE  THE  OBSERVED  CHI-SQUARED  STATISTIC  : CHIOBS 
IF  (IX2G2  .EQ.  1)  THEN 
PRINT* 

PRINT*,'OBSERVED DATA'

END  IF 

DO  390  K=1,IK 
MATRIX(K,1,1)=ITAB(K,1)
MATRIX(K,1,2)=ITAB(K,2)
MATRIX(K,2,1)=ITAB(K,3)
MATRIX(K,2,2)=ITAB(K,4)

IF (IX2G2 .EQ. 1) THEN
WRITE(*,70004)K,ITAB(K,1),ITAB(K,2),ITAB(K,3),
1 ITAB(K,4)
70004 FORMAT(5X,I5,' ; ',4I10)
C WRITE(*,70001)K,NIK(1,K),NIK(2,K),NJK(1,K),NJK(2,K),NTOT(K)
END IF

390 CONTINUE

CALL CMHNN1(NROW,NCOL,NSTM,NIK,NJK,NTOT,MATRIX,CMH,G,
1 JCIO,FIT)

CHIOBS=CMH
GOBS=G

IF (IX2G2 .EQ. 1) THEN
WRITE(*,70003)POBSH1/DENO,CHIOBS,GOBS
70003 FORMAT('OBSERVED Pr(t_o), X^2, G^2 = ',3F14.7,/,/)

ENDIF 



395  IF  (JCIO  .EQ.  3)  GO  TO  1000 



C UPPER  TAIL 
C TO  MAKE  INUMl 

C KIM_9.F 

IF  (JCIO  .EQ.  1)  GO  TO  666 

ICOUNT1=0

DO 400 I=1,ITOT
IF (INUM(I,IK+1) .GE. ISUMTS) THEN
ICOUNT1=ICOUNT1+1
INUM1(ICOUNT1,1)=INUM(I,IK+1)
HYPD1(ICOUNT1,1)=HYPD2(I,IK+1)
CHI1(ICOUNT1)=CHI(I)

IF (ICOUNT1 .GE. MAXT) THEN
PRINT*,'INCREASE ARRAY INUM1,HYPD1 IN SUBROUTINE IMPROV'
print*,'icount,icount1',icount,icount1
go to 1000
END  IF 

END  IF 

400  CONTINUE 

IF  (JCIO  .EQ.  0)  THEN 

C PRINT*,'NO OF TABLES FOR T .GE. T_OBS =',ICOUNT1

END  IF 

C PRINT* 

C PRINT*, 'DISPLAY  ALL  TABLES  FOR  T .GE.  T_OBS  (Y=1,N=0)  ?' 

C READ(*,*)IDIS1 

IDIS1=0 

IF (IDIS1 .EQ. 1) THEN
c PRINT*,'INPUT NO. OF INCREMENT :'
c READ(*,*)INCR1
INCR1=1

PRINT*,'ENUMERATION FOR THOSE TABLES HAVING T .GE. T_OBS'
DO 420 I=1,ICOUNT1,INCR1
PRINT*,I,INUM1(I,1)+ISUML
420  CONTINUE 

ENDIF 

C PRINT* 

C COMPUTATION  OF  PROB  OF  OBSERVING  OBSERVED  AND  RANDOM  TABLES 

C PRINT*,'PROB FOR THOSE TABLES HAVING T .GE. T_OBS (Y=1,N=0) ?'

C READ(*,*)IDS2 

IDS2=0 


IF  (IDS2  .EQ.  1)  THEN 

PRINT*,'ENUMERATION OF PROB FOR THOSE TABLES HAVING T .GE. T_OBS'
DO 550 I=1,ICOUNT1,INCR
PRINT*,I,sngl(HYPD1(I,1)/DENO)

550  CONTINUE 

END  IF 

C print* 

PTOBS1=0.D0

C PRINT*, 'DISPLAY  UPPER  TAIL  IMPROV.  PROB  (Y=1,N=0)  ?' 

C READ(*,*)IDS3 

IDS3=0 

IF  (IDS3  .EQ.  1)  WRITE(*,888) 
c WRITE (*,888) 

DO 560 I=1,ICOUNT1
IF (INUM1(I,1) .GT. ISUMTS .OR. INUM1(I,1) .EQ. ISUMTS
1 .AND. CHI1(I) .GE. CHIOBS) THEN
C 1 .AND. HYPD1(I,1) .LE. POBSH1) THEN
PTOBS1=PTOBS1+HYPD1(I,1)
if (IDS3 .NE. 1) GO TO 555
WRITE(*,900)INUM1(I,1)+ISUML,ISUMTS+ISUML,
2 HYPD1(I,1)/DENO,POBSH1/DENO,PTOBS1/DENO
900 FORMAT(2(1X,I7),3(1X,F12.6))

555  ENDIF 
560  CONTINUE 
c ENDIF 

C print* 

PEXIMU=PTOBS1/DENO

C PEXIMU IS IMPROVED UPPER TAIL EXACT PROB.

PTOBS2=0.D0

C PRINT*, 'DISPLAY  UPPER  TAIL  PROB  (Y=1,N=0)  ?' 

C READ(*,*)IDS4 

IDS4=0 

IF  (IDS4  .EQ.  1)  WRITE(*,889) 

C WRITE(*,889) 

DO 570 I=1,ICOUNT1
IF (INUM1(I,1) .GE. ISUMTS) THEN
PTOBS2=PTOBS2+HYPD1(I,1)
IF (IDS4 .NE. 1) GO TO 565
WRITE(*,900)INUM1(I,1)+ISUML,ISUMTS+ISUML,
3 HYPD1(I,1)/DENO,POBSH1/DENO,PTOBS2/DENO
C PRINT*,INUM1(I,1),ISUMTS,SNGL(HYPD1(I,1)/DENO),
C 3 SNGL(POBSH1/DENO),SNGL(PTOBS2/DENO)

565  ENDIF 
570  CONTINUE 
C ENDIF 

C print* 

PEXU=PTOBS2/DENO

C PEXU  IS  UPPER  TAIL  EXACT  PROB. 

C PRINT*,'IMPROVED P.EXACT = ',PEXIMP
C PRINT*,'         P.EXACT =',PEX
c PRINT*,'PROB. OF OBSERVED TABLES =',POBSH1/DENO
C WRITE(*,901)PEXIMU
C WRITE(*,902)PEXU
C PRINT*
C PRINT*,'PROB. OF OBSERVED TABLES =',POBSH1/DENO

C PRINT* 

IF  (JCIO  .EQ.  2)  THEN 
PEXIMP=PEXIMU 
PEX=PEXU 
GO  TO  1000 
ENDIF 

888 FORMAT('  T  T_obs  Pr(Ta)  pr(Ta_obs)  P(T>T_obs+
1order)')
889 FORMAT('  T  T_obs  Pr(Ta)  pr(Ta_obs)  P(T>=T_obs)
2')
901 FORMAT('IMPROVED UPPER P.EXACT =',3X,F12.6)
902 FORMAT('         UPPER P.EXACT =',3X,F12.6)




C LOWER  TAIL 
C TO  MAKE  INUMl 
C KIM_9.F

666 ICOUNT1=0

DO 600 I=1,ITOT
IF (INUM(I,IK+1) .LE. ISUMTS) THEN
ICOUNT1=ICOUNT1+1
INUM1(ICOUNT1,1)=INUM(I,IK+1)
HYPD1(ICOUNT1,1)=HYPD2(I,IK+1)
CHI1(ICOUNT1)=CHI(I)
IF (ICOUNT1 .GE. MAXT) THEN
PRINT*,'INCREASE ARRAY INUM1,HYPD1 IN SUBROUTINE IMPROV'
print*,'icount,icount1',icount,icount1
go to 1000
ENDIF 

END  IF 

600  CONTINUE 

IF  (JCIO  .EQ.  0)  THEN 

C PRINT*,'NO OF TABLES FOR T .LE. T_OBS =',ICOUNT1

ENDIF

C PRINT*
C PRINT*,'DISPLAY ALL TABLES FOR T .LE. T_OBS (Y=1,N=0) ?'

C READ(*,*)IDIS1 

IDIS1=0 

IF (IDIS1 .EQ. 1) THEN
c PRINT*,'INPUT NO. OF INCREMENT :'
c READ(*,*)INCR1
INCR1=1

PRINT*,'ENUMERATION FOR THOSE TABLES HAVING T .LE. T_OBS'
DO 620 I=1,ICOUNT1,INCR1
PRINT*,I,INUM1(I,1)+ISUML
620  CONTINUE 

ENDIF 

C PRINT* 

C COMPUTATION  OF  PROB  OF  OBSERVING  OBSERVED  AND  RANDOM  TABLES 

C PRINT*, 'PROB  FOR  THOSE  TABLES  HAVING  T .LE.  T_OBS  (Y=1,N=0)  ?' 

C READ(*,*)IDS2 

IDS2=0 

IF  (IDS2  .EQ.  1)  THEN 

PRINT*,'ENUMERATION OF PROB FOR THOSE TABLES HAVING T .LE. T_OBS'
DO 750 I=1,ICOUNT1,INCR
PRINT*,I,sngl(HYPD1(I,1)/DENO)

750  CONTINUE 

END  IF 

C print* 

PTOBS1=0.D0

C PRINT*, 'DISPLAY  LOWER  TAIL  IMPROV.  PROB  (Y=1,N=0)  ?' 

C READ(*,*)IDS3 

IDS3=0 

IF  (IDS3  .EQ.  1)  WRITE(*,890) 
c WRITE(*,890) 

DO 760 I=1,ICOUNT1
IF (INUM1(I,1) .LT. ISUMTS .OR. INUM1(I,1) .EQ. ISUMTS
1 .AND. CHI1(I) .GE. CHIOBS) THEN
C 1 .AND. HYPD1(I,1) .LE. POBSH1) THEN
PTOBS1=PTOBS1+HYPD1(I,1)
if (IDS3 .NE. 1) GO TO 755
WRITE(*,900)INUM1(I,1)+ISUML,ISUMTS+ISUML,
2 HYPD1(I,1)/DENO,POBSH1/DENO,PTOBS1/DENO
755 ENDIF

760  CONTINUE 
c ENDIF 

C print* 

PEXIML=PTOBS1/DENO

C PEXIML IS IMPROVED LOWER TAIL EXACT PROB.

PTOBS2=0.D0

C PRINT*, 'DISPLAY  LOWER  TAIL  PROB  (Y=1,N=0)  ?' 

C READ(*,*)IDS4 

IDS4=0 

IF  (IDS4  .EQ.  1)  WRITE(*,891) 

C WRITE(*,891) 

DO 770 I=1,ICOUNT1
IF (INUM1(I,1) .LE. ISUMTS) THEN
PTOBS2=PTOBS2+HYPD1(I,1)
IF (IDS4 .NE. 1) GO TO 765
WRITE(*,900)INUM1(I,1)+ISUML,ISUMTS+ISUML,
3 HYPD1(I,1)/DENO,POBSH1/DENO,PTOBS2/DENO
C PRINT*,INUM1(I,1),ISUMTS,SNGL(HYPD1(I,1)/DENO),
C 3 SNGL(POBSH1/DENO),SNGL(PTOBS2/DENO)
765 ENDIF
770 CONTINUE
C ENDIF 

C print* 

PEXL=PTOBS2/DENO

C PEXL IS LOWER TAIL EXACT PROB.
C PRINT*,'IMPROVED LOWER P.EXACT =',PEXIML
C PRINT*,'         LOWER P.EXACT =',PEXL
c PRINT*,'PROB. OF OBSERVED TABLES =',POBSH1/DENO
C WRITE(*,903)PEXIML
C WRITE(*,904)PEXL
C PRINT*
C PRINT*,'PROB. OF OBSERVED TABLES =',POBSH1/DENO

C PRINT* 

IF  (JCIO  .EQ.  1)  THEN 
PEXIMP=PEXIML 
PEX=PEXL 
GO  TO  1000 

ENDIF 


890 FORMAT('  T  T_obs  Pr(Ta)  pr(Ta_obs)  P(T<T_obs+
1order)')
891 FORMAT('  T  T_obs  Pr(Ta)  pr(Ta_obs)  P(T<=T_obs)
2')
903 FORMAT('IMPROVED LOWER P.EXACT =',3X,F12.6)
904 FORMAT('         LOWER P.EXACT =',3X,F12.6)
C

IF  (PEXIML  .GT.  PEXIMU)  PEXIML=PEXIMU 
PEXIMP=PEXIML 

IF  (PEXL  .GT.  PEXU)  PEXL=PEXU 
PEX=PEXL 

C PRINT* 

C PRINT*,' 

C WRITE (*, 906) PEXIMP 

C WRITE (*, 907) PEX 

C PRINT* 

C PRINT*,'PROB. OF OBSERVED TABLES =',POBSH1/DENO
C PRINT*,'

C PRINT* 

906  FORMAT ('IMPROVED  P.EXACT  =',3X,F12.6) 

C906  FORMAT('  MID  P P_EXACT  =',3X,F12.6) 

907 FORMAT('         P.EXACT =',3X,F12.6)

C WRITE(*,1200) 

1200  FORMAT ('EXPECTED  VALUE  OF  P.VALUES  (Y=1,N=0)  ?') 
C READ(*,*)IEP 

IEP=0 

IF (IEP .NE. 1) GO TO 1000
1215  WRITE(*,1210) 

1210  FORMAT (/, 'ENTER  CODE  : 1.  LOWER  TAIL  ONLY ',/ , 

1 ' 2.  UPPER  TAIL  ONLY') 

C READ(*,*)IUL 



C SELECT  LOWERTAIL  OR  UPPER  TAIL 
IUL=1 




PRINT*, 'CODE  =' ,IUL 
PRINT* 

IF (IUL .NE. 1 .AND. IUL .NE. 2) GO TO 1215

PEXP1=0.D0 

PEXP2=0.D0 

DO 1220 I=1,ITOT

INO=I
ISUMTS=INUM(I,IK+1)
POBSH1=HYPD2(I,IK+1)

PRINT*,'NO. T =',ISUMTS+ISUML,' Pr(Ta) = ',POBSH1/DENO

CALL MEANP(INO,IK,HYPD,ISUMTS,POBSH1,IUL,PEXIMP,PEX)
PEXP1=PEXP1+PEXIMP*POBSH1/DENO
PEXP2=PEXP2+PEX*POBSH1/DENO

PVDIS(I,1)=DBLE(INO)
PVDIS(I,2)=DBLE(ISUMTS+ISUML)
PVDIS(I,3)=POBSH1
PVDIS(I,4)=PEXIMP
PVDIS(I,5)=PEX
1220  CONTINUE 

PRINT* 

PRINT* , ' 

WRITE(*,1230) PEXP1
WRITE(*,1240) PEXP2

PRINT*,  ' 

PRINT* 

PRINT* 

OPEN(UNIT=31,FILE='mean.out')

WRITE(31,1250) PEXP1,PEXP2

1250 FORMAT('E_P_improved = ',F12.6,5X,'E_P_ordinary = ',F12.6)
1230 FORMAT('MEAN OF IMPROVED EXACT P_VALUES =',F12.6)
1240 FORMAT('MEAN OF STANDARD EXACT P_VALUES =',F12.6)

WRITE(*,1300)

1300 FORMAT('THE CDF OF TWO P-VALUES (Y=1,N=0) ?',/,
1 ' i.e., Pr(P-value <= x ,0<x<1) ?')

PRINT* 

C READ(*,*)IDP 
IDP=0 

PRINT*, 'CDF=0' 

PRINT* 

IF  (IDP  .NE.  1)  GO  TO  1000 

DO  1310  IX=1,500 

X=DBLE(IX)/500.D0 

PDN1=0.D0 

PDN2=0.D0 

DO 1320 I=1,ITOT

IF  (PVDIS(I,4)  .LE.  X)  PDN1=PDN1+PVDIS(I,3) 

IF  (PVDIS(I,5)  .LE.  X)  PDN2=PDN2+PVDIS(I,3) 

1320  CONTINUE 

PDT(IX,1)=X 
PDT(IX,2)=PDN1 
PDT(IX,3)=PDN2 
1310  CONTINUE 

C PDT(IX,2)/DENO FOR IMPROVED AND PDT(IX,3)/DENO FOR ORDINARY
C IS Pr(P-value<=x) for x=PDT(IX,1).

OPEN(UNIT=32,FILE='pcdf.out')
DO 1330 IX=1,500
WRITE(32,1340) PDT(IX,1),PDT(IX,2)/DENO,PDT(IX,3)/DENO
1330 CONTINUE

1340 FORMAT(F12.6,1X,F12.6,1X,F12.6)

1000  RETURN 
END 
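
The tail sums in IMPROV differ only in how tables with T equal to the observed value are treated: the ordinary upper-tail P-value counts all of them, while the improved one counts only those whose secondary statistic is at least as large as the observed one. The free-form sketch below illustrates that rule on five made-up tables; the program name `pvalue_demo`, the arrays `t`, `prob`, `stat2`, and all numeric values are hypothetical and do not come from the dissertation.

```fortran
! Sketch only: ordinary versus improved upper-tail P-value on a toy
! sample space. All names and numbers are illustrative assumptions.
program pvalue_demo
  implicit none
  integer, parameter :: n = 5
  integer :: t(n), i, tobs
  double precision :: prob(n), stat2(n), s2obs, pstd, pimp

  t     = (/ 0, 1, 1, 2, 2 /)                            ! primary statistic T
  prob  = (/ 0.2d0, 0.3d0, 0.2d0, 0.2d0, 0.1d0 /)        ! table probabilities
  stat2 = (/ 3.0d0, 1.0d0, 2.0d0, 0.5d0, 4.0d0 /)        ! secondary statistic
  tobs  = 1
  s2obs = 2.0d0

  pstd = 0.0d0
  pimp = 0.0d0
  do i = 1, n
     ! ordinary exact P-value: every table with T >= t_obs
     if (t(i) >= tobs) pstd = pstd + prob(i)
     ! improved: T > t_obs, or a tie on T broken by the secondary statistic
     if (t(i) > tobs .or. (t(i) == tobs .and. stat2(i) >= s2obs)) then
        pimp = pimp + prob(i)
     end if
  end do

  print '(a,f8.4)', 'ordinary upper-tail P = ', pstd
  print '(a,f8.4)', 'improved upper-tail P = ', pimp
end program pvalue_demo
```

By construction the improved sum runs over a subset of the tables in the ordinary sum, so it can never exceed the ordinary P-value.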

C **********************************************************************
C FROM  ALL  POSSIBLE  TABLES  WITH  GIVEN  2x2  MARGINS 
C COMPUTE  ALL  POSSIBLE  STANDARD  AND  IMPROVED  EXACT 
C UPPER  OR  LOWER  TAIL  PROBABILITY. 

C INUM,  HYPD2  IS  FOR  ALL  ENUMERATION. 

C INUM1, HYPD1 IS FOR UPPER OR LOWER TAIL.

C234567

SUBROUTINE MEANP(INO,IK,HYPD,ISUMTS,POBSH1,IUL,PEXIMP,PEX)
PARAMETER (MAXT=270000)

DOUBLE PRECISION HYPD(1000,0:2000),HYPD2(270000,20)
DOUBLE PRECISION HYPD1(270000,1)
DOUBLE PRECISION DENO,TS,POBSH1,PTOBS1,PTOBS2,PEXIMP,PEX
DOUBLE PRECISION PEXIML,PEXIMU,PEXL,PEXU
INTEGER INUM(270000,20),INUM1(270000,1)

COMMON /DKIM/ DENO,ITOT,ISUML,INUM,HYPD2,INUM1,HYPD1


OPEN(UNIT=30,FILE='pdis.out')

C WRITE(30,100) INO
100 FORMAT('NO. =',I10)

C WRITE(30,101) ISUMTS+ISUML,POBSH1/DENO
101 FORMAT('T =',I10,5X,'Pr(Table) =',F15.10)

C COMPUTATION OF LOWER TAIL P_VALUE
IF (IUL .EQ. 1) GO TO 598
C UPPER TAIL

C TO MAKE INUM1 : ALL POSSIBLE TABLES SUCH THAT T>=t_obs
C COMPUTATION OF PROB OF OBSERVING RANDOM TABLES
C TO MAKE HYPD1 : THE VALUE OF PROB FOR INUM1
C PROB OF THE TABLE IS HYPD1(I,1)/DENO

ICOUNT1=0
DO 400 I=1,ITOT
IF (INUM(I,IK+1) .GE. ISUMTS) THEN
ICOUNT1=ICOUNT1+1
C DO 410 J=1,IK+1
INUM1(ICOUNT1,1)=INUM(I,IK+1)
HYPD1(ICOUNT1,1)=HYPD2(I,IK+1)
C410 CONTINUE
END IF
400 CONTINUE

C INUM1 HAS T, HYPD1 HAS Pr(Table).

C WRITE(30,102) ICOUNT1
102 FORMAT('NO OF TABLES FOR T .GE. T_OBS =',I10)


PTOBS1=0.D0
PTOBS2=0.D0

DO 560 I=1,ICOUNT1
PTOBS2=PTOBS2+HYPD1(I,1)
IF (INUM1(I,1) .GT. ISUMTS .OR. INUM1(I,1) .EQ. ISUMTS
1 .AND. HYPD1(I,1) .LE. POBSH1) THEN
PTOBS1=PTOBS1+HYPD1(I,1)
555 ENDIF
560 CONTINUE

PEXIMU=PTOBS1/DENO
C PEXIMU IS IMPROVED UPPER TAIL EXACT PROB.
PEXU=PTOBS2/DENO
C PEXU IS UPPER TAIL EXACT PROB.

WRITE(30,903)INO,ISUMTS+ISUML,POBSH1/DENO,PEXIMU,PEXU
C WRITE(30,901)PEXIMU
C WRITE(30,902)PEXU
C PRINT*
C PRINT*,'PROB. OF OBSERVED TABLES = ',POBSH1/DENO
C PRINT* 


PEXIMP=PEXIMU 
PEX=PEXU 
GO  TO  1000 

888 FORMAT('  T  T_obs  Pr(Ta)  pr(Ta_obs)  P(T>T_obs+
1order)')
889 FORMAT('  T  T_obs  Pr(Ta)  pr(Ta_obs)  P(T>=T_obs)
2')
901 FORMAT('IMPROVED UPPER P.EXACT =',3X,F12.6)
902 FORMAT('         UPPER P.EXACT =',3X,F12.6,/)
C

c— 

C LOWER  TAIL 

C TO MAKE INUM1 : ALL POSSIBLE TABLES SUCH THAT T<=t_obs
C COMPUTATION OF PROB OF OBSERVING RANDOM TABLES
C TO MAKE HYPD1 : THE VALUE OF PROB FOR INUM1
C PROB OF THE TABLE IS HYPD1(I,1)/DENO

598 ICOUNT1=0

DO 600 I=1,ITOT
IF (INUM(I,IK+1) .LE. ISUMTS) THEN
ICOUNT1=ICOUNT1+1
C DO 610 J=1,IK+1
INUM1(ICOUNT1,1)=INUM(I,IK+1)
HYPD1(ICOUNT1,1)=HYPD2(I,IK+1)

C610  CONTINUE 
END  IF 

600  CONTINUE 
C WRITE(30,103) ICOUNT1

103  FORMAT('NO  OF  TABLES  FOR  T .LE.  T_OBS  =',I10) 


PTOBS1=0.D0
PTOBS2=0.D0

DO 760 I=1,ICOUNT1
PTOBS2=PTOBS2+HYPD1(I,1)
IF (INUM1(I,1) .LT. ISUMTS .OR. INUM1(I,1) .EQ. ISUMTS
1 .AND. HYPD1(I,1) .LE. POBSH1) THEN
PTOBS1=PTOBS1+HYPD1(I,1)
755 ENDIF
760 CONTINUE

PEXIML=PTOBS1/DENO
C PEXIML IS IMPROVED LOWER TAIL EXACT PROB.
PEXL=PTOBS2/DENO
C PEXL IS LOWER TAIL EXACT PROB.

WRITE(30,903)INO,ISUMTS+ISUML,POBSH1/DENO,PEXIML,PEXL
C WRITE(30,903)PEXIML
C WRITE(30,904)PEXL
C PRINT*
C PRINT*,'PROB. OF OBSERVED TABLES = ',POBSH1/DENO

C PRINT* 

PEXIMP=PEXIML 

PEX=PEXL 

890 FORMAT('  T  T_obs  Pr(Ta)  pr(Ta_obs)  P(T<T_obs+
1order)')
891 FORMAT('  T  T_obs  Pr(Ta)  pr(Ta_obs)  P(T<=T_obs)
2')
903 FORMAT(I8,I7,1X,F15.10,1X,F12.6,F12.6)
904 FORMAT('         LOWER P.EXACT =',3X,F12.6,/)


1000  RETURN 
END 


C FROM  KARIM  MAY  90  SLR. FOR  ********** 

DOUBLE PRECISION FUNCTION SUMLG(DDD1,DDD2)

DOUBLE PRECISION DDD,DD1,DD2,DDD1,DDD2
DOUBLE PRECISION ZLOG,ZEXP,X
C
ZLOG(X) = DLOG(X)
ZEXP(X) = DEXP(X)
C
DD1=DDD1
DD2=DDD2
C PRINT *,'HELLO FROM WITHIN SUMLG'
C
DDD = DD1
IF (DD2 .GT. DD1) DDD=DD2
DD1 = ZEXP(DD1-DDD)
DD2 = ZEXP(DD2-DDD)
DDD = ZLOG(DD1+DD2) + DDD

SUMLG  = DDD 

RETURN 

END 
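
SUMLG returns log(exp(a)+exp(b)) by factoring out the larger argument before exponentiating, which is what lets the log-scale convolution avoid underflow and overflow. The free-form sketch below contrasts that form with the naive expression; the names `sumlg_demo` and `logsum` and the test values are illustrative assumptions, not part of the listing.

```fortran
! Sketch only: stable log(exp(x)+exp(y)) versus the naive expression.
! Names and inputs are hypothetical.
program sumlg_demo
  implicit none
  double precision :: a, b

  a = -1000.0d0
  b = -1001.0d0

  ! exp(a) and exp(b) both underflow to zero in double precision,
  ! so the naive form loses the answer entirely
  print *, 'naive :', log(exp(a) + exp(b))
  print *, 'stable:', logsum(a, b)

contains

  double precision function logsum(x, y)
    double precision, intent(in) :: x, y
    double precision :: m
    m = max(x, y)                       ! factor out the larger term
    logsum = m + log(exp(x - m) + exp(y - m))
  end function logsum

end program sumlg_demo
```

After the shift, the larger term contributes exp(0) = 1, so the argument of the logarithm always lies between 1 and 2.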

C **********************************************************************

C DIFFERENT SUMLG IN JUNE 91
c DOUBLE PRECISION FUNCTION SUMLG(DD1,DD2)
c DOUBLE PRECISION DDD,DD1,DD2
c DOUBLE PRECISION ZLOG,ZEXP,X
cC
c ZLOG(X) = DLOG(X)
c ZEXP(X) = DEXP(X)
cC
c DDD = DD1
c IF (DD2 .GT. DD1) DDD=DD2
c DD1 = ZEXP(DD1-DDD)
c DD2 = ZEXP(DD2-DDD)
c DDD = ZLOG(DD1+DD2) + DDD
c SUMLG = DDD
c RETURN
c END

C **********************************************************************

DOUBLE  PRECISION  FUNCTION  FLOWER(BETA) 

C 

C CALCULATES  P(T=T)*FF  + P(T<T) 

C 

DOUBLE PRECISION BETA,A(5500),SLA,SLB,SLU,UUU,SUMLG,C(5500),K,SLO
+ ,FF

INTEGER J,SCD
COMMON/PARAM/C,J,SCD,K,FF
C
DO 30 I=1,SCD
A(I)=C(I)+(I-1)*BETA
30 CONTINUE
C
SLA=A(1)
DO 40 I=2,J-1
SLA=SUMLG(SLA,A(I))
40 CONTINUE

SLO=A(J)
SLB=A(J+1)
DO 50 I=J+2,SCD
SLB=SUMLG(SLB,A(I))
50 CONTINUE

UUU=SUMLG(SLA,SLB)
SLA=SUMLG(SLA,SLO+DLOG(FF))

UUU=SUMLG(UUU,SLO) 

C 

SLU  = DEXP(SLA  - UUU) 

SLU  = SLU  - K 
FLOWER  = SLU 
RETURN 
END 

C **********************************************************************

DOUBLE  PRECISION  FUNCTION  FLOWBO(BETA) 

C 

C CALCULATES  P(T=T)*FF  WHEN  T IS  ON  LOWER  BOUNDARY 
C 

DOUBLE PRECISION BETA,A(5500),SLA,SLB,SLU,UUU,SUMLG,C(5500),K,SLO
+ ,FF

INTEGER J,SCD
COMMON/PARAM/C,J,SCD,K,FF
C
DO 30 I=1,SCD
A(I)=C(I)+(I-1)*BETA
30 CONTINUE
C
SLO=A(1)
SLB=A(J+1)
DO 50 I=J+2,SCD
SLB=SUMLG(SLB,A(I))
50 CONTINUE

UUU=SUMLG(SLB,SLO)
SLO=SLO+DLOG(FF)
C
SLU = DEXP(SLO - UUU)
SLU = SLU - K
FLOWBO = SLU
RETURN 
END 

C **********************************************************************

DOUBLE  PRECISION  FUNCTION  FUPPER(BETA) 

C 

C CALCULATES  P(T=T)*FF  + P(T>T) 

C 

DOUBLE PRECISION BETA,A(5500),SLA,SLB,SLU,UUU,SUMLG,C(5500),K,SLO
+ ,FF

INTEGER J,SCD
COMMON/PARAM/C,J,SCD,K,FF

DO 30 I=1,SCD
A(I)=C(I)+(I-1)*BETA
30 CONTINUE
C
SLA=A(1)
DO 40 I=2,J-1
SLA=SUMLG(SLA,A(I))
40 CONTINUE

SLO=A(J)
SLB=A(J+1)
DO 50 I=J+2,SCD
SLB=SUMLG(SLB,A(I))
50 CONTINUE

UUU=SUMLG(SLA,SLB)
SLB=SUMLG(SLB,SLO+DLOG(FF))

UUU=SUMLG(UUU,SLO) 

SLU  = DEXP(SLB  - UUU) 

SLU  = SLU  - K 
FUPPER  = SLU 
RETURN 
END 

C **********************************************************************
DOUBLE PRECISION FUNCTION FUPPBO(BETA)

C 

C CALCULATES  P(T=T)*FF  WHEN  T IS  ON  UPPER  BOUNDARY 
C 

DOUBLE PRECISION BETA,A(5500),SLA,SLB,SLU,UUU,SUMLG,C(5500),K,SLO
+ ,FF

INTEGER J,SCD
COMMON/PARAM/C,J,SCD,K,FF
C
DO 30 I=1,SCD
A(I)=C(I)+(I-1)*BETA
30 CONTINUE
C
SLA=A(1)
DO 40 I=2,J-1
SLA=SUMLG(SLA,A(I))

40  CONTINUE 

SLO=A(J) 

UUU=SUMLG(SLA,SLO) 

SLO=SLO+DLOG(FF) 

C 

SLU  = DEXP(SLO  - UUU) 

SLU  = SLU  - K 
FUPPBO  = SLU 
RETURN 
END 

C **********************************************************************

SUBROUTINE brent(X0,tol,IMAX,zbrent,Func,ITER)
c FUNCTION ZBRENT(FUNC,X1,X2,TOL)
c Van Wijngaarden-Dekker-Brent method
c in Press WH, Flannery BP, Teukolsky SA, Vetterling WT:
c Numerical Recipes - The Art of Scientific Computing
c (Fortran version). Cambridge: Cambridge University Press, 1989
c code on pages 253-254.
c
c Using Brent's method, find the root of a function FUNC known to
c lie between X1 and X2. The root returned as ZBRENT will be refined
c until its accuracy is TOL.
c (EPS is machine floating point precision, see p 16)
c eps changed - declarations + delx0 introduced
c
PARAMETER(ITMAX=100,EPS=1.d-14)

double precision delx0,func,tol,zbrent,a,b,c,d,e,fa,fb,fc
double precision p,q,r,s,xm,x0,tol1
external func
delx0=.2d+00
1 A=X0-delx0
B=X0+delx0
FA=FUNC(A)
FB=FUNC(B)

IF(FB*FA.GT.0)then
C print *,'BRACKET ROOT!!'
delx0=delx0*2
goto 1
endif

c no modifications below this line

FC=FB 

DO  11  ITER=1,ITMAX 

IF(FB*FC.GT.0)THEN

C=A 

FC=FA 

D=B-A 

E=D 

ENDIF 

IF(ABS(FC).LT.ABS(FB))THEN
A=B 
B=C 
C=A 
FA=FB 
FB=FC 
FC=FA 
ENDIF 

TOL1=2.*EPS*ABS(B)+0.5*TOL
XM=.5*(C-B)

IF(ABS(XM).LE.TOL1 .OR. FB.EQ.0.)THEN
ZBRENT=B
RETURN
ENDIF

IF(ABS(E).GE.TOL1 .AND. ABS(FA).GT.ABS(FB))THEN
S=FB/FA
IF(A.EQ.C)THEN
P=2.*XM*S
Q=1.-S
ELSE
Q=FA/FC
R=FB/FC
P=S*(2.*XM*Q*(Q-R)-(B-A)*(R-1.))
Q=(Q-1.)*(R-1.)*(S-1.)
END IF

IF(P.GT.0.) Q=-Q
P=ABS(P)

IF(2.*P .LT. MIN(3.*XM*Q-ABS(TOL1*Q),ABS(E*Q)))THEN
E=D
D=P/Q
ELSE
D=XM
E=D
END IF

ELSE
d=xm
e=d
endif

A=B
FA=FB

IF(ABS(D) .GT. TOL1) THEN
B=B+D
ELSE
B=B+SIGN(TOL1,XM)

ENDIF 

FB=FUNC(B) 

11  CONTINUE 

PAUSE  'MAX  IT' 

ZBRENT=B 

RETURN 

end 

C **********************************************************************

SUBROUTINE brent1(X0,tol,IMAX,zbrent,ITER,JCI,JCIO,PALPHA)

c FUNCTION ZBRENT(FUNC,X1,X2,TOL)
c Van Wijngaarden-Dekker-Brent method
c in Press WH, Flannery BP, Teukolsky SA, Vetterling WT:
c Numerical Recipes - The Art of Scientific Computing
c (Fortran version). Cambridge: Cambridge University Press, 1989
c code on pages 253-254.
c
c Using Brent's method, find the root of a function FUNC known to
c lie between X1 and X2. The root returned as ZBRENT will be refined
c until its accuracy is TOL.
c (EPS is machine floating point precision, see p 16)
c eps changed - declarations + delx0 introduced

IMPLICIT REAL*8 (A-H,O-Z)

PARAMETER (ITMAX=100,EPS=1.d-14)

c double precision delx0,tol,zbrent,a,b,c,d,e,fa,fb,fc
double precision delx0,func,tol,zbrent,a,b,c,d,e,fa,fb,fc
double precision p,q,r,s,xm,x0,tol1,OR1,OR2,FA1,FB1

integer itab(1000,4),infhyl(1000),infhyu(1000)
double precision hyp(0:2000),ds(0:1,0:5500),lge,POBSH
DOUBLE PRECISION hypd(1000,0:2000),POBSH1,PEXIMP,PEX
INTEGER J,SCD

DOUBLE PRECISION CA(5500),K,FF

COMMON/CI1/ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh
COMMON/CI2/hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX
COMMON/PARAM/CA,J,SCD,K,FF

external func
delx0=1.2d0
OR1=X0

IF (X0-delx0 .LE. 0.D0) THEN
OR1=OR1/2.d0
ELSE
OR1=X0-delx0
END IF

OR2=X0+delx0
A=OR1
B=OR2

PRINT*
PRINT*,'ODDS RATIO OR1,OR2= ',OR1,OR2

call cnv2x2(ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh,
1 jci,OR1)
CALL IMPROV(ik,itab,hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX,JCIO,OR1)

FA=PEXIMP-K
FA1=PEX-K

call cnv2x2(ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh,
1 jci,OR2)
CALL IMPROV(ik,itab,hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX,JCIO,OR2)

FB=PEXIMP-K
FB1=PEX-K

IF(FB*FA.GT.0)then
print *,'BRACKET ROOT!!'
PRINT*,'K,X0,delx0,OR1,OR2,FA,FB',K,X0,delx0,OR1,OR2,FA,FB
delx0=delx0*2
goto 1
endif

c no modifications below this line

FC=FB 

DO  11  ITER=1,ITMAX 

IF(FB*FC.GT.0)THEN

C=A 

FC=FA 

D=B-A 

E=D 

ENDIF 

IF(ABS(FC) .LT.ABS(FB))THEN 
A=B 
B=C 
C=A 
FA=FB 
FB=FC 
FC=FA 
ENDIF 

TOL1=2.*EPS*ABS(B)+0.5*TOL
XM=.5*(C-B)

IF(ABS(XM).LE.TOL1 .OR. FB.EQ.0.)THEN
ZBRENT=B
OR2=B
PRINT*,'ODDS RATIO OR2 ',OR2

call cnv2x2(ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh,
1 jci,OR2)
CALL IMPROV(ik,itab,hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX,JCIO,OR2)

PALPHA=PEXIMP
PRINT*,'PEXIMP=',PEXIMP
PRINT*,'FIRST RETURN ZBRENT=B',B

RETURN 

ENDIF 
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IF(ABS(E) .GE.TQLl  .AND.  ABS (FA) . GT . ABS (FB) )THEN 
S=FB/FA
IF(A.EQ.C)THEN
P=2.*XM*S
Q=1.-S
ELSE
Q=FA/FC
R=FB/FC
P=S*(2.*XM*Q*(Q-R)-(B-A)*(R-1.))
Q=(Q-1.)*(R-1.)*(S-1.)
ENDIF

IF(P.GT.0.) Q=-Q
P=ABS(P)

IF(2.*P .LT. MIN(3.*XM*Q-ABS(TOL1*Q),ABS(E*Q)))THEN
E=D
D=P/Q
ELSE
D=XM
E=D
ENDIF

ELSE
D=XM
E=D
ENDIF

A=B
FA=FB

IF(ABS(D) .GT. TOL1) THEN
B=B+D
ELSE
B=B+SIGN(TOL1,XM)
ENDIF

call cnv2x2(ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh,
1 jci,OR2)
CALL IMPROV(ik,itab,hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX,JCI0,OR2)
FB=PEXIMP-K
FB1=PEX-K


ENDIF 


C 


FB=FUNC(B) 


C 


OR2=B
PRINT*,'ODDS RATIO OR2 ',OR2

11 CONTINUE
PAUSE 'MAX IT'
ZBRENT=B
PRINT*,'SECOND RETURN ZBRENT=B',B
C
RETURN

end 


SUBROUTINE SATO(ITAB,IK,LL,UL,MH,KA,IA,RLL,RUL,IPOS,VRBG)
C
C CALCULATES THE LIMITS OF EQUATION (2) IN
C SATO, T. (1990). CONFIDENCE LIMITS FOR THE COMMON ODDS RATIO
C BASED ON THE ASYMPTOTIC DISTRIBUTION OF THE MANTEL-HAENSZEL
C ESTIMATOR. BIOMETRICS, 46, 71-80.
C
INTEGER ITAB(1000,4),IA,IPOS
DOUBLE PRECISION IN,IM,INN,R,S,P,Q,W,RK,SK,SQ,LL,UL,CHI2,MH
DOUBLE PRECISION KA(100,2),SVD1,SVD2,SVD3,VRBG,RUL,RLL
DATA W/0.D00/,RK/0.D00/,SK/0.D00/,SVD1/0.D00/,SVD2/0.D00/
DATA SVD3/0.D00/

CHI2 = KA(IA,1)
C PRINT *,CHI2
C ADDED BY DONGUK KIM, OCT. 3, 1993
C THIS IS REQUIRED FOR THE ITERATION OF RANDOM TABLES.
C SET TO ZERO.
W=0.D0
RK=0.D0
SK=0.D0
SVD1=0.D0
SVD2=0.D0
SVD3=0.D0

DO 100 I=1,IK
IT = ITAB(I,1) + ITAB(I,2)
IN = ITAB(I,1) + ITAB(I,3)
IM = ITAB(I,2) + ITAB(I,4)
INN = IN + IM
R = ITAB(I,1)*ITAB(I,4)/INN
S = ITAB(I,2)*ITAB(I,3)/INN
P = (ITAB(I,1) + ITAB(I,4))/INN
Q = (ITAB(I,2) + ITAB(I,3))/INN
W = W + (Q + 1/INN)*R + (P + 1/INN)*S
RK = RK + R
SK  = SK  + S 

C 

c ROBINS  J,  BRESLOW  NE,  GREENLAND  S.  ESTIMATORS  OF 

C THE  MANTEL-HAENSZEL  VARIANCE  CONSISTENT  IN  BOTH 

C SPARSE  DATA  AND  LARGE-STRATA  LIMITING  MODELS 

C BIOMETRICS  1986;42:311-23. 

C 

C VARIANCE 

SVD1 = SVD1 + P*R
SVD2 = SVD2 + (Q*R + P*S)
SVD3 = SVD3 + Q*S

100 CONTINUE

C RBG LIMITS (CONT)
IF (IPOS .GE. 1) THEN
RUL=999
RLL=999
GOTO 109
END IF

VRBG=SVD1/2/RK/RK + SVD2/2/RK/SK + SVD3/2/SK/SK
RLL = DEXP(DLOG(RK/SK)-SQRT(CHI2*VRBG))
RUL = DEXP(DLOG(RK/SK)+SQRT(CHI2*VRBG))
C PRINT *,RLL,RUL



109  SQ  = SQRT((4*RK*SK  + CHI2*W) *CHI2*W) 

IF(SK  .EQ.  0.0)  GOTO  110 

LL  = (2*RK*SK  + CHI2*W  - SQ)/2/SK/SK 
UL  = (2*RK*SK  + CHI2*W  + Sq)/2/SK/SK 
MH  = RK/SK 
C PRINT  RK/SK 

C PRINT  *,LL,UL 

GOTO  120 

110  LL  = (2*RK*SK  + CHI2*W  - SQ)/2/RK/RK 
UL  = (2*RK*SK  + CHI2*W  + SQ)/2/RK/RK 
LL  = 1/UL 

C PRINT *,'INFINITE POINT ESTIMATE - LOWER LIMIT ONLY'

C PRINT  *,LL 

120  RETURN 
END 
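
The quantities accumulated in SATO (RK, SK, SVD1, SVD2, SVD3) can be cross-checked with a short sketch. The following is written in Python rather than Fortran for compactness and is only an illustration, not part of the dissertation's program: the function name `mh_rbg_limits`, the tuple layout `(a, b, c, d)` (mirroring ITAB(I,1..4)), and the argument `z` (a standard normal quantile, so that `z**2` plays the role of CHI2) are assumptions. It covers only the Mantel-Haenszel estimate and the Robins-Breslow-Greenland limits, not Sato's limits.

```python
import math


def mh_rbg_limits(tables, z=1.959963984540054):
    """Mantel-Haenszel odds ratio with Robins-Breslow-Greenland limits.

    tables: iterable of (a, b, c, d) counts, one 2x2 table per stratum.
    Assumes the sums of a*d/n and b*c/n are both positive.
    """
    rk = sk = svd1 = svd2 = svd3 = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        r = a * d / n
        s = b * c / n
        p = (a + d) / n
        q = (b + c) / n
        rk += r
        sk += s
        svd1 += p * r
        svd2 += q * r + p * s
        svd3 += q * s
    mh = rk / sk
    # Same expression as VRBG in the listing.
    var = svd1 / (2 * rk * rk) + svd2 / (2 * rk * sk) + svd3 / (2 * sk * sk)
    half_width = z * math.sqrt(var)
    lower = math.exp(math.log(mh) - half_width)
    upper = math.exp(math.log(mh) + half_width)
    return mh, lower, upper, var
```

For a single stratum the variance reduces to 1/a + 1/b + 1/c + 1/d, which gives a quick sanity check on the accumulation.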


C234567
SUBROUTINE IT2(ALPHA,INOUT,CA1,CA2,PT,ATO,SUM,IOOTO)
C
C INOUT=1 IF t_obs IS IN THE GIVEN PROBABILITY DISTRIBUTION WITH
C PROBABILITY 1-ALPHA, ELSE 0

IMPLICIT REAL*8 (A-H,O-Z)
c PARAMETER(EPS=1.d-6)
PARAMETER(EPS=1.d-14)

integer itab(1000,4),infhyl(1000),infhyu(1000)
double precision hyp(0:2000),ds(0:1,0:5500),lge,POBSH
DOUBLE PRECISION hypd(1000,0:2000),POBSH1,PEXIMP,PEX
INTEGER J,SCD

DOUBLE PRECISION CA(5500),K,FF,CA1(5500),CA2(5500),INC(5500)
DOUBLE PRECISION CA3(5500)

COMMON/CI1/ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh
COMMON/CI2/hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX
COMMON/PARAM/CA,J,SCD,K,FF

DO  10  I=1,K2-K1+1 
10  CA3(I)=CA1(I) 

C APPLYING  MODIFIED  P 
INOUT=1
SUM=0.D0

CALL SHELL(K2-K1+1,CA2)

DO 100 I=1,K2-K1+1
DO 110 I1=1,K2-K1+1
IF (CA2(I) .EQ. CA3(I1)) THEN
INC(I)=I1
C FOR OTHER T THAT HAS THE SAME PROB.
CA3(I1)=0.D0
GO TO 100
END IF
110 CONTINUE
C105 PRINT*,'CA2(I),INC(I)= ',CA2(I),INC(I)

100  CONTINUE 

C 

C FOR  TWO-SIDED  LIMITS  ADD  TERMS  FROM  SMALLEST  PROB 
C IN  ASCENDING  ORDER  OF  SIZE  (NOT  FROM  EITHER  TAIL) . 

C 

IP=1 

150  SUM=SUM+CA2(IP) 

IIK=INC(IP)
IF (IOOTO .EQ. 2) THEN
C TWO SIDED-MODIFIED P (MODIFIED STERNE-TYPE P)
IF (IIK .EQ. J) SUM=SUM-ATO
END IF

CA1(IIK)=0.D0
C PRINT*,'SUM,IIK= ',SUM,IIK

IF (SUM .GE. ALPHA) THEN
C PRINT*,'SUM,IIK,INOUT= ',SUM,IIK,INOUT

RETURN 
ENDIF 

C IF  (INC(IP)  .EQ.  J)  THEN 

IF  (IP  .EQ.  K2-K1+1  .OR.  CA2(IP+1)  .GT.  PT)  THEN 
INOUT=0
C PRINT*,'SUM,IIK,INOUT= ',SUM,IIK,INOUT

RETURN 
ENDIF 
IP=IP+1 
GO  TO  150 
END 
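
The accumulation rule described in the comments of IT2 (add terms from the smallest probability upward, not from either tail) corresponds to a Sterne-type two-sided P-value. The sketch below is in Python rather than Fortran and covers only the ordinary version, without the ATO adjustment applied when IOOTO equals 2. The names `sterne_p_value` and `in_acceptance_region` and the small relative tolerance `rel_tol` are assumptions made for illustration; the listing itself compares probabilities exactly.

```python
def sterne_p_value(probs, j_obs, rel_tol=1e-12):
    """Sum the probabilities that do not exceed the observed one.

    probs: probabilities of all support points of the test statistic.
    j_obs: index of the observed value within probs.
    """
    p_obs = probs[j_obs]
    # rel_tol guards against ties broken by floating-point rounding.
    return sum(p for p in probs if p <= p_obs * (1.0 + rel_tol))


def in_acceptance_region(probs, j_obs, alpha):
    """Analogue of the flag INOUT: True when the P-value is at least alpha."""
    return sterne_p_value(probs, j_obs) >= alpha
```

For example, with probabilities 0.1, 0.2, 0.4, 0.3 and the second point observed, the terms 0.1 and 0.2 are summed.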


C234567 

SUBROUTINE ITERA(ALPHA,START,RHO1,ist,PALPHA,IOOTO)

C GIVEN ALPHA, STARTING VALUE, ITERA ITERATES AND RETURNS RHO A LOWER LIMIT.

IMPLICIT REAL*8 (A-H,O-Z)
PARAMETER(EPS=1.d-14)
PARAMETER(NNIT=1000)
c PARAMETER(EPS=1.d-6)

integer itab(1000,4),infhyl(1000),infhyu(1000)
double precision hyp(0:2000),ds(0:1,0:5500),lge,POBSH,PSI
DOUBLE PRECISION hypd(1000,0:2000),POBSH1,PEXIMP,PEX
DOUBLE PRECISION SRT(NNIT,2),SRTS(NNIT),SRTSR(NNIT,2)
DOUBLE PRECISION SRT1(NNIT,2),SRTS1(NNIT),SRTSR1(NNIT,2)
INTEGER J,SCD

DOUBLE PRECISION CA(5500),K,FF,CA1(5500),CA2(5500),CC(5500)

COMMON/CI1/ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh
COMMON/CI2/hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX
COMMON/PARAM/CA,J,SCD,K,FF
COMMON/ITN/SRT,SRTS,SRTSR,SRT1,SRTS1,SRTSR1

C IF (IST .EQ. 1) THEN
C OPEN(UNIT=35,FILE='st_lo_ci.out')
C OPEN(UNIT=39,FILE='st_lo_all.out')
C ELSE
C OPEN(UNIT=36,FILE='st_up_ci.out')
C OPEN(UNIT=40,FILE='st_up_all.out')
C ENDIF

DO 5 JJI=1,NNIT
SRT(JJI,1)=0.D0
SRT(JJI,2)=0.D0
SRTS(JJI)=0.D0
SRTSR(JJI,1)=0.D0
SRTSR(JJI,2)=0.D0
SRT1(JJI,1)=0.D0
SRT1(JJI,2)=0.D0
SRTS1(JJI)=0.D0
SRTSR1(JJI,1)=0.D0
5 SRTSR1(JJI,2)=0.D0


PSI=START
RHO=0.D0
KL=0
ITE=0
ITE1=0

C FOR STERNE'S CI, JCI=1 SHOULD BE ASSIGNED FOR ODR COMPUTATION,
C WHENEVER WE CALL CNV2X2.
JCI=1

10 call cnv2x2(ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh,
1 jci,PSI)

JCI0=3
CALL IMPROV(ik,itab,hypd,infhyl,infhyu,POBSH1,PEXIMP,
1 PEX,JCI0,PSI)

C COMPUTE MODIFIED EXACT ALTERNATIVE PROB DISTN.
CALL COMPT(CC,ATO,PT,IOOTO)
c print*,'psi =',psi
c CA1(I) : PROBABILITY
DO  90  I=1,K2-K1+1 
CA1(I)=CC(I) 

90  CONTINUE 

c CA2(I) : DUPLICATE OF CA1 AND
C THIS WILL BE SORTED PROBABILITY IN ASCENDING ORDER AFTER IT1.
DO 95 I=1,K2-K1+1
CA2(I)=CC(I)
95 CONTINUE

C IN=INOUT(ALPHA)
C CALL IT1(ALPHA,INOUT,CA1)
CALL IT2(ALPHA,INOUT,CA1,CA2,PT,ATO,SUM,IOOTO)

IN=INOUT 

C 

C KL  IS  0 UNTIL  CORRECT  VALUE  IS  SPANNED  BY  RHO  AND  OPSI, 
C THEN  KL  IS  SET  TO  1 . 

C 

C IN=1  IF  PSI  IS  TOO  LARGE,  ELSE  IN=0 . 

C 

C ATO  IS  INCLUDED  IN  ACCEPTANCE  REGION. 

IF (CA1(J) .EQ. 0.D0) THEN
CA1(J)=ATO
ELSE
CA1(J)=CA1(J)+ATO
END IF

PCHK=0.D0
DO 100 I=1,K2-K1+1
PCHK=PCHK+CA1(I)
C IF (I .EQ. K2-K1+1) print*,i,CA1(I),pchk
100 continue

C PRINT*,'PSI, TWO-SIDED P-VALUE = ',PSI,SUM
C PRINT*,'P.ACCEPT, TOTAL P = ',PCHK,SUM+PCHK

ITE1=ITE1+1
SRT1(ITE1,1)=PSI
SRT1(ITE1,2)=SUM

IF (ITE1 .GT. 1000) THEN
PRINT*,'NOT CONVERGE IN TWO-SIDED P'
PALPHA=-99999.99999
RHO1=-99999.99999
GO  TO  99 

END  IF 


C SUM  IS  THE  TWO-SIDED  P_VALUE. 

IF  (IN  .EQ.  0)  THEN 
ITE=ITE+1 
SRT(ITE,1)=PSI 
SRT(ITE,2)=SUM 

ENDIF 

IF  (KL  .EQ.  1)  GO  TO  40 
IF  (IN  .EQ.  1)  GO  TO  20 
RHO=PSI 

c PSI=PSI*1.1D0
if (ist .eq. 1) PSI=PSI*1.01D0
if (ist .eq. 2) PSI=PSI*0.99D0

GO TO 10
20 KL=1
OPSI=PSI
30 PSI=(RHO+OPSI)*0.5D0

C 

C NEW  ESTIMATE  IS  MIDPOINT  OF  SPANNING  INTERVAL 
GO  TO  10 

40  IF  (IN  .EQ.  1)  OPSI=PSI 

IF  (IN  .NE.  1)  RHO=PSI 

C IF (DABS(RHO/OPSI -1.D0) .LT. EPS) RETURN

IF (DABS(RHO/OPSI -1.D0) .LT. EPS) THEN

IF (ITE1 .GT. NNIT) THEN
PRINT*,'INCREASE NNIT FOR ARRAYS SRT,SRTS'
GO TO 99
ELSE
C PRINT*,'NO OF ITERATION = ',ITE,ITE1
C PRINT*
ENDIF

DO 102 JJI=1,ITE
102 SRTS(JJI)=SRT(JJI,2)

CALL SHELL1(ITE,SRTS)

DO 200 I=1,ITE
DO 210 I1=1,ITE
IF (SRTS(I) .EQ. SRT(I1,2)) THEN
SRTSR(I,1)=SRT(I1,1)
SRTSR(I,2)=SRT(I1,2) 

GO  TO  200 

END  IF 

210  CONTINUE 
200  CONTINUE 

C IF (IST .EQ.1) THEN
C DO 105 JJI=ITE,1,-1
C105 WRITE(35,107) JJI,SRTSR(JJI,1),SRTSR(JJI,2)
C ELSE
C DO 106 JJI=ITE,1,-1
C106 WRITE(36,107) JJI,SRTSR(JJI,1),SRTSR(JJI,2)
C ENDIF

107 FORMAT(I10,2F20.15)

C PALPHA=SRTSR(ITE,2)
C RHO1=SRTSR(ITE,1)
C PRINT*,'FINAL LIMIT(RHO) = ',RHO

C SORTING BY THETA
DO 300 JJI=1,ITE1
300 SRTS1(JJI)=SRT1(JJI,1)

CALL SHELL1(ITE1,SRTS1)

C FOR THE LOWER LIMIT P_VALUE IS SAVED IN ASCENDING ORDER.
IF (IST .EQ. 1) THEN
DO 310 I=1,ITE1
DO 320 I1=1,ITE1
IF (SRTS1(I) .EQ. SRT1(I1,1)) THEN
SRTSR1(I,1)=SRT1(I1,1) 
SRTSR1(I,2)=SRT1(I1,2) 

SRT1(I1,1)=0.D0 

C FOR  THE  SAKE  OF  THE  SAME  THETA. 

GO  TO  310 

ENDIF 

320  CONTINUE 
310  CONTINUE 

C FOR THE UPPER LIMIT P_VALUE IS SAVED IN ASCENDING ORDER.
C THAT IS, THETA IS SAVED IN DESCENDING ORDER.
ELSE

DO 330 I=1,ITE1
DO 340 I1=ITE1,1,-1
IF (SRTS1(I) .EQ. SRT1(I1,1)) THEN
SRTSR1(ITE1-I+1,1)=SRT1(I1,1)
SRTSR1(ITE1-I+1,2)=SRT1(I1,2)
SRT1(I1,1)=0.D0

C FOR  THE  SAKE  OF  THE  SAME  THETA. 

GO  TO  330 

END  IF 

340  CONTINUE 
330  CONTINUE 
END  IF 

DO  350  JJI=1,ITE1 
IF (SRTSR1(JJI,1) .EQ. RHO) THEN
ITE2=JJI
GO TO 360
END IF
350 CONTINUE
360 CONTINUE

C360 PRINT*,'LIMIT(RHO) (ITE2) =',ITE2

ITE3=ITE2
DO 400 JJI=ITE2,1,-1
400 IF (SRTSR1(JJI,2) .GT. ALPHA) ITE3=JJI

IF  (ITE3  .NE.  ITE2)  THEN 
ITE4=ITE3-1 

ELSE 

ITE4=ITE3 

ENDIF 

C PRINT*,'ITE3,ITE4 = ',ITE3,ITE4

C FIND THE MAXIMUM P_VALUE WHICH CAN NOT EXCEED ALPHA/2 AND ITS THETA.
TMAX=SRTSR1(ITE4,1)
PRMAX=SRTSR1(ITE4,2)
PALPHA=PRMAX
RHO1=TMAX

C PRINT*,'CLOSER ALPHA=',SRTSR(ITE,1),SRTSR(ITE,2)
C PRINT*,'LIMIT(RHO) = ',TMAX,PRMAX

C IF (IST .EQ.1) THEN
C DO 420 JJI=ITE1,1,-1
C420 WRITE(39,107) JJI,SRTSR1(JJI,1),SRTSR1(JJI,2)
C ELSE
C DO 430 JJI=ITE1,1,-1
C430 WRITE(40,107) JJI,SRTSR1(JJI,1),SRTSR1(JJI,2)
C END IF


99  RETURN 
END  IF 

GO  TO  30 
END 
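
The search in ITERA has two phases: PSI is multiplied by a fixed factor until the flag IN changes, and the bracketing interval (RHO, OPSI) is then bisected until its relative width falls below EPS. A minimal Python sketch of that control flow follows, for the increasing search only (ist equal to 1). It is written in Python rather than Fortran for brevity; the function name `find_limit` and the callable `too_large`, which stands in for the CNV2X2/IMPROV/IT2 evaluation, are hypothetical.

```python
def find_limit(too_large, start, step=1.01, eps=1e-14, max_iter=1000):
    """Bracket by multiplicative steps, then bisect.

    too_large(psi) -> bool plays the role of the flag IN
    (True when PSI is too large). Returns the analogue of RHO.
    """
    rho = 0.0
    psi = start
    calls = 0
    # Phase 1: step upward until the target is bracketed.
    while True:
        calls += 1
        if calls > max_iter:
            raise RuntimeError("no convergence within max_iter evaluations")
        if too_large(psi):
            break
        rho = psi
        psi *= step
    opsi = psi
    # Phase 2: bisect (rho, opsi) until the relative width is below eps.
    while abs(rho / opsi - 1.0) >= eps:
        calls += 1
        if calls > max_iter:
            raise RuntimeError("no convergence within max_iter evaluations")
        psi = 0.5 * (rho + opsi)
        if too_large(psi):
            opsi = psi
        else:
            rho = psi
    return rho
```

A step of 1.01 keeps the initial bracket narrow, so the bisection phase needs only a few dozen evaluations even with a tolerance near machine precision.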


C234567 

SUBROUTINE COMPT(CC,ATO,PT,IOOTO)

IMPLICIT REAL*8 (A-H,O-Z)
PARAMETER(EPS=1.d-14)

integer itab(1000,4),infhyl(1000),infhyu(1000)
INTEGER INUM(270000,20),INUM1(270000,1)

double precision hyp(0:2000),ds(0:1,0:5500),lge,POBSH
DOUBLE PRECISION hypd(1000,0:2000),POBSH1,PEXIMP,PEX
DOUBLE PRECISION HYPD1(270000,1),HYPD2(270000,20)
DOUBLE PRECISION HYPD3(270000,1)

C HYPD3(270000,1) IS PR(T) FOR EACH TABLE.

INTEGER J,SCD
DOUBLE PRECISION CA(5500),K,FF,CC(5500)
DOUBLE PRECISION CHI(270000),CHI1(270000),CHIOBS


COMMON/CI1/ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh
COMMON/CI2/hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX
COMMON/PARAM/CA,J,SCD,K,FF

COMMON /DKIM/ DENO,ITOT,ISUML,INUM,HYPD2,INUM1,HYPD1
c COMMON /DKIM1/ POBSH1,ISUMTS,IK
COMMON /DKIM1/ ISUMTS
COMMON /CHI/ CHI,CHIOBS


DO 250 I=1,K2-K1+1
250 CC(I)=0.D0

DO 300 I=1,ITOT
IL=INUM(I,IK+1)+1
CC(IL)=CC(IL)+HYPD2(I,IK+1)

300  CONTINUE 

c aa=0.d0 

DO  310  I=1,K2-K1+1 
CC(I)=CC(I)/DENO 
c aa=aa+cc(i) 

c print*, i, cc(i) ,aa 

310  continue 

DO 320 I=1,ITOT
IM=INUM(I,IK+1)+1
HYPD3(I,1)=CC(IM)
320 CONTINUE

ICOUNT1=0
DO 400 I=1,ITOT
C IF (INUM(I,IK+1) .EQ. ISUMTS) THEN
IF (HYPD3(I,1) .EQ. CC(J)) THEN
ICOUNT1=ICOUNT1+1
INUM1(ICOUNT1,1)=INUM(I,IK+1)
HYPD1(ICOUNT1,1)=HYPD2(I,IK+1)
CHI1(ICOUNT1)=CHI(I)

END  IF 

400  CONTINUE 

PTOBS3=0.D0
PTOBS5=0.D0

DO 560 I=1,ICOUNT1
C IF (INUM1(I,1) .EQ. ISUMTS
C 1 .AND. HYPD1(I,1) .LE. POBSH1) THEN
C IF (HYPD1(I,1) .LE. POBSH1) THEN

C CHI-SQUARED STATISTIC IS USED FOR SECONDARY PARTITION.
IF (CHI1(I) .GE. CHIOBS) THEN
PTOBS3=PTOBS3+HYPD1(I,1)
ELSE
PTOBS5=PTOBS5+HYPD1(I,1)
END IF

560 CONTINUE


PTO=PTOBS3/DENO
ATO=PTOBS5/DENO
PT=CC(J)

C ATO IS INCLUDED IN ACCEPTANCE REGION.

C PRINT*,'IMPROVED P(T=T_0) = ',PTO
c PRINT*,'ORD P - IMPROVED P(T=T_0) = ',ATO

C CC(J)=PTO

RETURN 

END 


C234567 

C GIVEN  ALPHA,  STARTING  VALUE,  ITERA  ITERATES  AND  RETURNS  RHO  A LIMIT. 

SUBROUTINE ITERA1(ALPHA,START,RHO1,ist,JCI0,PALPHA)

IMPLICIT REAL*8 (A-H,O-Z)
PARAMETER(EPS=1.d-14)
PARAMETER(NNIT=1000)
c PARAMETER(EPS=1.d-6)

integer itab(1000,4),infhyl(1000),infhyu(1000)
double precision hyp(0:2000),ds(0:1,0:5500),lge,POBSH,PSI
DOUBLE PRECISION hypd(1000,0:2000),POBSH1,PEXIMP,PEX
DOUBLE PRECISION SRT(NNIT,2),SRTS(NNIT),SRTSR(NNIT,2)
DOUBLE PRECISION SRT1(NNIT,2),SRTS1(NNIT),SRTSR1(NNIT,2)
INTEGER J,SCD

DOUBLE PRECISION CA(5500),K,FF

COMMON/CI1/ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh
COMMON/CI2/hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX
COMMON/PARAM/CA,J,SCD,K,FF
COMMON/ITN/SRT,SRTS,SRTSR,SRT1,SRTS1,SRTSR1

C IF (IST .EQ. 1) THEN
C OPEN(UNIT=33,FILE='mp_lo_ci.out')
C OPEN(UNIT=37,FILE='mp_lo_all.out')
C ELSE
C OPEN(UNIT=34,FILE='mp_up_ci.out')
C OPEN(UNIT=38,FILE='mp_up_all.out')
C ENDIF

DO 5 JJI=1,NNIT
SRT(JJI,1)=0.D0
SRT(JJI,2)=0.D0
SRTS(JJI)=0.D0
SRTSR(JJI,1)=0.D0
SRTSR(JJI,2)=0.D0
SRT1(JJI,1)=0.D0
SRT1(JJI,2)=0.D0
SRTS1(JJI)=0.D0
SRTSR1(JJI,1)=0.D0
5 SRTSR1(JJI,2)=0.D0


PSI=START
RHO=0.D0
KL=0
ITE=0
ITE1=0

AALP=ALPHA/2.D0
IF (J .EQ. 1 .OR. J .EQ. SCD) AALP=ALPHA


C FOR STERNE'S CI, JCI=1 SHOULD BE ASSIGNED FOR ODR COMPUTATION,
C WHENEVER WE CALL CNV2X2.
JCI=1

10 call cnv2x2(ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh,
1 jci,PSI)

C JCI0=3
cc JCI0=0

CALL IMPROV(ik,itab,hypd,infhyl,infhyu,POBSH1,PEXIMP,
1 PEX,JCI0,PSI)

ITE1=ITE1+1
SRT1(ITE1,1)=PSI
SRT1(ITE1,2)=PEXIMP


IF (ITE1 .GT. 1000) THEN
PRINT*,'NOT CONVERGE IN ONE-SIDED MODIFIED P'
RHO1=-99999.99999
PALPHA=-99999.99999
GO TO 99

END  IF 


C print*,'psi, PEXIMP =',psi,PEXIMP
C IF (PEXIMP .GE. ALPHA/2.D0) THEN

IF  (PEXIMP  .GE.  AALP)  THEN 
IN=1 

ELSE 

IN=0 

ITE=ITE+1 

SRT(ITE,1)=PSI 

SRT(ITE,2)=PEXIMP 

END  IF 
C 

C KL  IS  0 UNTIL  CORRECT  VALUE  IS  SPANNED  BY  RHO  AND  OPSI, 

C THEN  KL  IS  SET  TO  1 . 

C 

C IN=1  IF  PSI  IS  TOO  LARGE,  ELSE  IN=0 . 

C 

IF  (KL  .EQ.  1)  GO  TO  40 
IF  (IN  .EQ.  1)  GO  TO  20 
RHO=PSI 

if (ist .eq. 1) PSI=PSI*1.01D0
if (ist .eq. 2) PSI=PSI*0.99D0

GO  TO  10 
20  KL=1 

OPSI=PSI 

30 PSI=(RHO+OPSI)*0.5D0

C 

C NEW  ESTIMATE  IS  MIDPOINT  OF  SPANNING  INTERVAL 
GO  TO  10 

40  IF  (IN  .EQ.  1)  OPSI=PSI 
IF  (IN  .NE.  1)  RHO=PSI 

c IF (DABS(RHO/OPSI -1.D0) .LT. EPS) RETURN

IF (DABS(RHO/OPSI -1.D0) .LT. EPS) THEN
JCI=1
PSI=RHO

call cnv2x2(ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh,
1 jci,PSI)
CALL IMPROV(ik,itab,hypd,infhyl,infhyu,POBSH1,PEXIMP,
1 PEX,JCI0,PSI)

C print*,'FINAL LIMIT : psi, PEXIMP =',psi,PEXIMP

ITE=ITE+1
SRT(ITE,1)=PSI
SRT(ITE,2)=PEXIMP

IF (ITE1 .GT. NNIT) THEN
PRINT*,'INCREASE NNIT FOR ARRAYS SRT,SRTS'
GO TO 99
ELSE
C PRINT*,'NO OF ITERATION ITE,ITE1= ',ITE,ITE1
C PRINT*
END IF

DO 100 JJI=1,ITE
100 SRTS(JJI)=SRT(JJI,2)

CALL SHELL1(ITE,SRTS)

DO 200 I=1,ITE
DO 210 I1=1,ITE
IF (SRTS(I) .EQ. SRT(I1,2)) THEN
SRTSR(I,1)=SRT(I1,1)
SRTSR(I,2)=SRT(I1,2)

GO  TO  200 

END  IF 

210  CONTINUE 
200  CONTINUE 

C IF (IST .EQ.1) THEN
C DO 105 JJI=ITE,1,-1
C105 WRITE(33,107) JJI,SRTSR(JJI,1),SRTSR(JJI,2)
C ELSE
C DO 106 JJI=ITE,1,-1
C106 WRITE(34,107) JJI,SRTSR(JJI,1),SRTSR(JJI,2)
C ENDIF

107 FORMAT(I10,2F20.15)

C PALPHA=SRTSR(ITE,2)
C RHO1=SRTSR(ITE,1)


C SORTING BY THETA
DO 300 JJI=1,ITE1
300 SRTS1(JJI)=SRT1(JJI,1)

CALL SHELL1(ITE1,SRTS1)

C FOR THE LOWER LIMIT P_VALUE IS SAVED IN ASCENDING ORDER.
IF (IST .EQ. 1) THEN
DO 310 I=1,ITE1
DO 320 I1=1,ITE1
IF (SRTS1(I) .EQ. SRT1(I1,1)) THEN
SRTSR1(I,1)=SRT1(I1,1) 
SRTSR1(I,2)=SRT1(I1,2) 

SRT1(I1,1)=0.D0 

C FOR  THE  SAKE  OF  THE  SAME  THETA. 

GO  TO  310 

END  IF 

320  CONTINUE 
310  CONTINUE 

C FOR  THE  UPPER  LIMIT  P_VALUE  IS  SAVED  IN  ASCENDING  ORDER. 
C THAT  IS,  THETA  IS  SAVED  IN  DESCENDING  ORDER. 

ELSE 

DO 330 I=1,ITE1
DO 340 I1=ITE1,1,-1
IF (SRTS1(I) .EQ. SRT1(I1,1)) THEN
SRTSR1(ITE1-I+1,1)=SRT1(I1,1) 
SRTSR1(ITE1-I+1,2)=SRT1(I1,2) 
SRT1(I1,1)=0.D0 

C FOR  THE  SAKE  OF  THE  SAME  THETA. 

GO  TO  330 

ENDIF 

340  CONTINUE 
330  CONTINUE 
ENDIF 

DO  350  JJI=1,ITE1 
IF  (SRTSR1(JJI,1)  .EQ.  RHO)  THEN 
ITE2=JJI 
GO  TO  360 

ENDIF 

350  CONTINUE 
360  CONTINUE 

C PRINT*, 'LIMIT (RHO)  (ITE2)  =',ITE2 
ITE3=ITE2 

DO  400  JJI=ITE2,1,-1 

400 IF (SRTSR1(JJI,2) .GT. AALP) ITE3=JJI
C400 IF (SRTSR1(JJI,2) .GT. ALPHA/2.D0) ITE3=JJI

IF (ITE3 .NE. ITE2) THEN
ITE4=ITE3-1
ELSE
ITE4=ITE3
ENDIF

C PRINT*,'ITE3,ITE4 = ',ITE3,ITE4

C FIND THE MAXIMUM P_VALUE WHICH CAN NOT EXCEED ALPHA/2 AND ITS THETA.
TMAX=SRTSR1(ITE4,1)
PRMAX=SRTSR1(ITE4,2)
PALPHA=PRMAX
RHO1=TMAX

C PRINT*,'CLOSER ALPHA/2=',SRTSR(ITE,1),SRTSR(ITE,2)
C PRINT*,'LIMIT(RHO) = ',TMAX,PRMAX

C IF (IST .EQ.1) THEN
C DO 420 JJI=ITE1,1,-1
C420 WRITE(37,107) JJI,SRTSR1(JJI,1),SRTSR1(JJI,2)
C ELSE
C DO 430 JJI=ITE1,1,-1
C430 WRITE(38,107) JJI,SRTSR1(JJI,1),SRTSR1(JJI,2)

C ENDIF 


99  RETURN 
ENDIF 

GO  TO  30 
END 


************************************************************************
* SHELL SORT
C234567

SUBROUTINE SHELL(N,ARR)

c Sorts an array ARR of length N into ascending numerical order,
c by the Shell-Mezgar algorithm (diminishing increment sort).
c N is input; ARR is replaced on output by its sorted rearrangement.

IMPLICIT REAL*8 (A-H,O-Z)
PARAMETER (ALN2I=1.D0/0.69314718,TINY=1.E-5)
REAL*8 ARR(5500)

LOGNB2=INT(ALOG(FLOAT(N))*ALN2I+TINY)
M=N

DO 12 NN=1,LOGNB2
M=M/2
K=N-M

DO 11 J=1,K
I=J
3 CONTINUE
L=I+M

IF(ARR(L).LT.ARR(I)) THEN
T=ARR(I)
ARR(I)=ARR(L)
ARR(L)=T
I=I-M
IF(I.GE.1)GO TO 3
END  IF 

11  CONTINUE 

12  CONTINUE 
RETURN 
END 
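
The diminishing increment sort used by SHELL can be summarised in a few lines. The sketch below is in Python rather than Fortran and is only an illustration of the technique; the function name `shell_sort` is an assumption, and unlike the Fortran routine it returns a sorted copy instead of sorting in place.

```python
def shell_sort(values):
    """Shell sort with the gap halved on each pass (n/2, n/4, ..., 1)."""
    a = list(values)
    gap = len(a) // 2
    while gap > 0:
        for j in range(gap, len(a)):
            i = j
            # Move a[j] back through its gap-spaced chain until in order.
            while i >= gap and a[i] < a[i - gap]:
                a[i], a[i - gap] = a[i - gap], a[i]
                i -= gap
        gap //= 2
    return a
```

The final pass with a gap of 1 is an ordinary insertion sort, so the result is fully sorted whatever the earlier gaps were.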

C234567 

SUBROUTINE SHELL1(N,ARR)

c Sorts an array ARR of length N into ascending numerical order,
c by the Shell-Mezgar algorithm (diminishing increment sort).
c N is input; ARR is replaced on output by its sorted rearrangement.

IMPLICIT REAL*8 (A-H,O-Z)
PARAMETER (ALN2I=1.D0/0.69314718,TINY=1.E-5)
PARAMETER (NNIT=1000)
REAL*8 ARR(NNIT)

LOGNB2=INT(ALOG(FLOAT(N))*ALN2I+TINY)
M=N

DO 12 NN=1,LOGNB2
M=M/2
K=N-M

DO 11 J=1,K
I=J
3 CONTINUE
L=I+M

IF(ARR(L).LT.ARR(I)) THEN
T=ARR(I)
ARR(I)=ARR(L)
ARR(L)=T
I=I-M
IF(I.GE.1)GO TO 3
ENDIF

11  CONTINUE 

12  CONTINUE 
RETURN 
END 


SUBROUTINE ITERA10(ALPHA,START,RHO1,ist,PALPHA,HYPD10)

C GIVEN ALPHA, STARTING VALUE, ITERA ITERATES AND RETURNS RHO
C A LOWER LIMIT.

IMPLICIT REAL*8 (A-H,O-Z)
PARAMETER(EPS=1.d-14)
PARAMETER(NNIT=1000)
c PARAMETER(EPS=1.d-6)

integer itab(1000,4),infhyl(1000),infhyu(1000)
double precision hyp(0:2000),ds(0:1,0:5500),lge,POBSH
DOUBLE PRECISION hypd(1000,0:2000),POBSH1,PEXIMP,PEX
DOUBLE PRECISION SRT(NNIT,2),SRTS(NNIT),SRTSR(NNIT,2)
DOUBLE PRECISION SRT1(NNIT,2),SRTS1(NNIT),SRTSR1(NNIT,2)
INTEGER J,SCD

DOUBLE PRECISION CA(5500),K,FF,CA1(5500),CA2(5500),CC(5500)
DOUBLE PRECISION HYPD2(270000,20),HYPD10(270000,1)


COMMON/CI1/ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh
COMMON/CI2/hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX
COMMON/PARAM/CA,J,SCD,K,FF
COMMON/ITN/SRT,SRTS,SRTSR,SRT1,SRTS1,SRTSR1


PSI=START
RHO=0.D0
KL=0
ITE=0
ITE1=0


C FOR STERNE'S CI, JCI=1 SHOULD BE ASSIGNED FOR ODR COMPUTATION,
C WHENEVER WE CALL CNV2X2.
JCI=1

10 call cnv2x2(ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh,
1 jci,PSI)

JCI0=3
CALL IMPROV(ik,itab,hypd,infhyl,infhyu,POBSH1,PEXIMP,
1 PEX,JCI0,PSI)

C COMPUTE MODIFIED EXACT ALTERNATIVE PROB DISTN.
CALL COMPT10(CC,ATO,PT,HYPD10)

RETURN 

END 

C234567 

SUBROUTINE COMPT10(CC,ATO,PT,HYPD10)

IMPLICIT REAL*8 (A-H,O-Z)
PARAMETER(EPS=1.d-14)

integer itab(1000,4),infhyl(1000),infhyu(1000)
INTEGER INUM(270000,20),INUM1(270000,1)

double precision hyp(0:2000),ds(0:1,0:5500),lge,POBSH
DOUBLE PRECISION hypd(1000,0:2000),POBSH1,PEXIMP,PEX
DOUBLE PRECISION HYPD1(270000,1),HYPD2(270000,20)
DOUBLE PRECISION HYPD3(270000,1),HYPD10(270000,1)

C HYPD10(270000,1) IS PR(T) FOR EACH TABLE.

INTEGER J,SCD
DOUBLE PRECISION CA(5500),K,FF,CC(5500)


COMMON/CI1/ik,mxs,mxz,mxd,lge,itab,hyp,ds,ipar,k1,k2,ierr,pobsh
COMMON/CI2/hypd,infhyl,infhyu,POBSH1,PEXIMP,PEX
COMMON/PARAM/CA,J,SCD,K,FF

COMMON /DKIM/ DENO,ITOT,ISUML,INUM,HYPD2,INUM1,HYPD1
c COMMON /DKIM1/ POBSH1,ISUMTS,IK
COMMON /DKIM1/ ISUMTS


DO 300 I=1,ITOT
HYPD10(I,1)=HYPD2(I,IK+1)/DENO
300 CONTINUE


RETURN

END

C EFFICIENT SCORE TEST STATISTICS 4

C234567 

SUBROUTINE CMHNN1(NROW,NCOL,NSTM,NIK,NJK,NTOT,MATRIX,CMH,G,
1 JCI0,FIT)

C TO COMPUTE THE EFFICIENT SCORE TEST STATISTIC.
C MAX NO. OF STRATUM: 1000
C NO. OF ROW AND COLUMN : 2, 2
C COMMON IS USED FOR NIK,NJK,NTOT

C IMPLICIT REAL*8 (A-H,O-Z)

DOUBLE PRECISION X,EV,CMH,G
DOUBLE PRECISION FIT(1000,2,2)

INTEGER MATRIX(1000,2,2),NIK(2,1000),NJK(2,1000),NTOT(1000)
C COMMON /A1/ NIK,NJK,NTOT

C DO 90 K=1,NSTM
C WRITE(*,1000)K,MATRIX(K,1,1),MATRIX(K,1,2),MATRIX(K,2,1),
C 1 MATRIX(K,2,2)
C WRITE(*,1001)K,NIK(1,K),NIK(2,K),NJK(1,K),NJK(2,K),NTOT(K)
C90 CONTINUE
C1000 FORMAT('DATA',5I10)
C1001 FORMAT('TOTAL',6I10)

X=0.D0
G=0.D0

DO 100 K=1,NSTM
DO 110 I=1,NROW
DO 110 J=1,NCOL

IF (JCI0 .EQ. 0) THEN
EV=(NIK(I,K)*NJK(J,K))/DBLE(NTOT(K))
ELSE
EV=FIT(K,I,J)
ENDIF

X=X+((DBLE(MATRIX(K,I,J))-EV)**2)/EV

C IF (MATRIX(K,I,J) .EQ. 0) GO TO 110
C G=G+DBLE(MATRIX(K,I,J))*DLOG(DBLE(MATRIX(K,I,J))/EV)

110 CONTINUE
100 CONTINUE
CMH=X

C G=2.D0*G

C WRITE(*,1010)CMH,G
C1010 FORMAT('CHI-SQUARED STATISTIC, G*2 =',2F12.7)

RETURN 

END 
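
The statistic accumulated in CMHNN1 is a Pearson-type sum of (observed - expected)**2 / expected over all cells and strata, with the expected count taken either from the stratum margins (JCI0 equal to 0) or from a supplied table of fitted values. A minimal Python sketch of that computation follows; it is written in Python rather than Fortran, and the function name `stratified_pearson` and the nested-list layout `tables[k][i][j]` are assumptions made for illustration. It assumes every expected count is positive.

```python
def stratified_pearson(tables, fitted=None):
    """Sum of (obs - exp)^2 / exp over all cells of all strata.

    tables: list of two-way tables (lists of rows), one per stratum.
    fitted: optional expected counts with the same shape; when omitted,
            expected counts are row total * column total / stratum total.
    """
    x2 = 0.0
    for k, table in enumerate(tables):
        row_tot = [sum(row) for row in table]
        col_tot = [sum(col) for col in zip(*table)]
        n = sum(row_tot)
        for i, row in enumerate(table):
            for j, obs in enumerate(row):
                if fitted is None:
                    ev = row_tot[i] * col_tot[j] / n
                else:
                    ev = fitted[k][i][j]
                x2 += (obs - ev) ** 2 / ev
    return x2
```

Passing `fitted` corresponds to the branch of the listing that reads EV from FIT, for instance fitted values produced under a hypothesised odds ratio.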

c ITERATIVE PROPORTIONAL FITTING ALGORITHM
C (XZ,YZ) WITH N_{11K}=OR FOR K=1,IK, 1 OTHERWISE.

SUBROUTINE IPF(PSI,IK,MATRIX,FIT)

C IMPLICIT REAL*8(A-H,O-Z)

C MAX NO. OF STRATA=100
C MAX NO. OF ITERATION=2000

PARAMETER(EPS=1.D-8)

DOUBLE PRECISION X(2,2,100),E(2000,2,2,100),EE(3,2,100)
DOUBLE PRECISION XX(3,2,100)
DOUBLE PRECISION ETH(100),FIT(1000,2,2)
DOUBLE PRECISION THETA,PSI,XA,EA,P1,P2,P3
DOUBLE PRECISION FSIK(2,100),FSJK(2,100)

INTEGER MATRIX(1000,2,2)


THETA=PSI
DO 5 K=1,IK
DO 5 I=1,2
DO 5 J=1,2
5 X(I,J,K)=DBLE(MATRIX(K,I,J))


XA=0.D0
DO 10 I=1,2
DO 10 J=1,2
DO 11 K=1,IK
11 XA=XA+X(I,J,K)
XX(1,I,J)=XA
XA=0.D0
10 CONTINUE

XA=0.D0
DO 20 I=1,2
DO 20 K=1,IK
DO 21 J=1,2
21 XA=XA+X(I,J,K)
XX(2,I,K)=XA
XA=0.D0
20 CONTINUE

XA=0.D0
DO 30 J=1,2
DO 30 K=1,IK
DO 31 I=1,2
31 XA=XA+X(I,J,K)
XX(3,J,K)=XA
XA=0.D0
30 CONTINUE

C 

C SET  TO  1 FOR  INITIAL  VALUE  OF  EXPECTED  VALUE 

C 

DO 40 I=1,2
DO 40 J=1,2
DO 50 K=1,IK

IF (I .EQ. 1 .AND. J .EQ. 1) THEN
E(1,I,J,K)=THETA
ELSE
E(1,I,J,K)=1.D0
END IF

50 CONTINUE
C50 E(1,I,J,K)=1.D0
40 CONTINUE

C
C COMPUTATION ROUTINE
C

N=1
C KK=1
KK=2

2222 N=N+1

IF (N .GT. 2000) THEN
PRINT*,'INCREASE ARRAY E(2000,2,2,100) IN IPF'
PRINT*,'IT DOES NOT CONVERGE WITHIN 2000 ITERATIONS.'
GO TO 999
END IF

IF (KK .EQ. 1) GO TO 1000
IF (KK .EQ. 2) GO TO 2000
IF (KK .EQ. 3) GO TO 3000

C STEP  1 

1000 EA=0.D0

DO 45 I=1,2
DO 45 J=1,2
DO 46 K=1,IK
46 EA=EA+E(N-1,I,J,K)
EE(1,I,J)=EA
EA=0.D0
45 CONTINUE

DO 100 I=1,2
DO 100 J=1,2
DO 100 K=1,IK
E(N,I,J,K)=XX(1,I,J)*E(N-1,I,J,K)/EE(1,I,J)
100 CONTINUE
KK=KK+1
GO TO 555

C 

C STEP  2 

C 

2000 EA=0.D0

DO 57 I=1,2
DO 57 K=1,IK
DO 51 J=1,2
51 EA=EA+E(N-1,I,J,K)
EE(2,I,K)=EA
EA=0.D0
57 CONTINUE

DO 101 I=1,2
DO 101 J=1,2
DO 101 K=1,IK
E(N,I,J,K)=XX(2,I,K)*E(N-1,I,J,K)/EE(2,I,K)
101 CONTINUE
KK=KK+1
GO TO 555

C 

C STEP  3 
C 

3000 EA=0.D0

DO 60 J=1,2
DO 60 K=1,IK
DO 61 I=1,2
61 EA=EA+E(N-1,I,J,K)
EE(3,J,K)=EA
EA=0.D0
60 CONTINUE

DO 102 I=1,2
DO 102 J=1,2
DO 102 K=1,IK
E(N,I,J,K)=XX(3,J,K)*E(N-1,I,J,K)/EE(3,J,K)
102 CONTINUE

C KK=1
KK=2

GO TO 555
C 

C CHECK  CONVERGENCE 

C 

555 DO 103 I=1,2
DO 103 J=1,2
DO 103 K=1,IK
N1=N-1
N2=N-2
C KEEP THE INDICES WITHIN THE BOUNDS OF E (N-2 IS 0 WHEN N=2)
IF (N1 .LT. 1) N1=1
IF (N2 .LT. 1) N2=1
P1=DABS(E(N,I,J,K)-E(N1,I,J,K))
P2=DABS(E(N,I,J,K)-E(N2,I,J,K))
P3=DABS(E(N1,I,J,K)-E(N2,I,J,K))

IF (P1 .GT. EPS .OR. P2 .GT. EPS .OR. P3 .GT. EPS)
1 GO TO 2222

103 CONTINUE
GO TO 1111

C
C PRINT
C

1111 CONTINUE

DO 666 K=1,IK
DO 667 I=1,2
DO 667 J=1,2
667 FIT(K,I,J)=E(N,I,J,K)
666 CONTINUE

C WRITE(*,131)
C WRITE(*,123) (((X(I,J,K),K=1,IK),J=1,2),I=1,2)
C WRITE(*,132)
C DO 157 I1=1,3
C WRITE(*,124)((XX(I1,J,K),K=1,IK),J=1,2)
C157 CONTINUE
C WRITE(*,133)
C WRITE(*,134)
C DO 77 JJ=1,N
C ESTIMATED ORS FOR EACH STRATUM
C DO 78 K=1,IK
C78 ETH(K)=E(JJ,1,1,K)*E(JJ,2,2,K)/(E(JJ,1,2,K)*E(JJ,2,1,K))
C NN=JJ-1
C WRITE(*,125)NN,(((E(JJ,I,J,K),K=1,IK),J=1,2),I=1,2),
C 1 (ETH(K),K=1,IK)
C77 CONTINUE
C WRITE(*,135)N-1



C CHECK  IF  OBSERVED  AND  FITTED  FREQUENCIES  MATCH. 

C XX(2,I,K)  : NIK(I,K)  <->FSIK(I,K) 

C XX(3,J,K)  : NJK(J,K)  <->FSJK(J,K) 

C FSIK(I,K) ,FSJK(J,K) 

DO 670 K=1,IK
DO 680 I=1,2
680 FSIK(I,K)=0.D0
DO 690 J=1,2
690 FSJK(J,K)=0.D0
670 CONTINUE

DO 700 K=1,IK
DO 710 I=1,2
DO 710 J=1,2
710 FSIK(I,K)=FSIK(I,K)+FIT(K,I,J)

DO 720 J=1,2
DO 720 I=1,2
720 FSJK(J,K)=FSJK(J,K)+FIT(K,I,J)

700 CONTINUE

C WRITE(*,140)
C WRITE(*,124)((FSIK(I,K),K=1,IK),I=1,2)
C WRITE(*,141)
C WRITE(*,124)((FSJK(J,K),K=1,IK),J=1,2)

140 FORMAT(/,10X,'X-Z MARGINAL DATA FOR FITTED VALUES')
141 FORMAT(/,10X,'Y-Z MARGINAL DATA FOR FITTED VALUES')

DO 730 K=1,IK
DO 730 I=1,2

IF (DABS(XX(2,I,K)-FSIK(I,K)) .GT. EPS) THEN
PRINT*,I,K,XX(2,I,K),FSIK(I,K),XX(2,I,K)-FSIK(I,K)
PRINT*,I,K,' OBSERVED AND FITTED FREQUENCIES DOES NOT MATCH '
PRINT*,' IN X-Z MARGINAL TABLE.'
END IF

730 CONTINUE

DO 740 K=1,IK
DO 740 J=1,2

IF (DABS(XX(3,J,K)-FSJK(J,K)) .GT. EPS) THEN
PRINT*,J,K,XX(3,J,K),FSJK(J,K),XX(3,J,K)-FSJK(J,K)
PRINT*,J,K,' OBSERVED AND FITTED FREQUENCIES DOES NOT MATCH '
PRINT*,' IN Y-Z MARGINAL TABLE.'
ENDIF

740 CONTINUE


123 FORMAT(10(8F9.3,/))
124 FORMAT(20(5F9.3,/))
125 FORMAT(I3,1X,10(8F9.3,/))
131 FORMAT(10X,'OUTPUT',/,10X,'DATA')
132 FORMAT(/,10X,'MARGINAL DATA FOR EACH STEP')
133 FORMAT(/,10X,'EXPECTED VALUE IN EACH ITERATION')
134 FORMAT(6X,' M(111) M(112) M(121) M(122) M(211) M(212) ',
1 ' M(221) M(222) OR1 OR2')
135 FORMAT(10X,'CONVERGENCE IN ',I5,' ITERATIONS.')

999  RETURN 
END 
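
As listed, IPF starts with KK=2 and resets KK to 2 after step 3, so only the row-by-stratum (XZ) and column-by-stratum (YZ) margins are fitted, and the starting values (THETA in cell (1,1), 1 elsewhere) fix the odds ratio in every stratum. Because those two margins are fitted stratum by stratum, the idea can be sketched for a single 2 x 2 table. The sketch below is in Python rather than Fortran; the function name `ipf_2x2`, its tolerance, and its stopping rule (row totals matched after a column step) are assumptions and differ from the three-way comparison of successive iterates used in the listing. It assumes all margins are positive.

```python
def ipf_2x2(table, theta, tol=1e-10, max_iter=2000):
    """Fit a 2x2 table with the observed margins and odds ratio theta."""
    row_tot = [table[0][0] + table[0][1], table[1][0] + table[1][1]]
    col_tot = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    # Seed carries the target odds ratio; scaling rows or columns keeps it.
    fit = [[theta, 1.0], [1.0, 1.0]]
    for _ in range(max_iter):
        for i in range(2):  # match row totals
            s = fit[i][0] + fit[i][1]
            fit[i] = [v * row_tot[i] / s for v in fit[i]]
        for j in range(2):  # match column totals
            s = fit[0][j] + fit[1][j]
            for i in range(2):
                fit[i][j] *= col_tot[j] / s
        if all(abs(sum(fit[i]) - row_tot[i]) < tol for i in range(2)):
            return fit
    raise RuntimeError("IPF did not converge within max_iter iterations")
```

Row and column scalings multiply the cross-product ratio by factors that cancel, which is why the fitted table keeps the odds ratio of the seed while taking on the observed margins.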


APPENDIX  B 

SOURCE CODE FOR SIMULATION


The following are the program structure and part of the FORTRAN source code for
approximating exact inference about conditional association in I x J x K
contingency tables. They show how the estimate of the ordinary or modified exact
P-value for six tests can be constructed.
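
The simulation estimate of an exact P-value is the proportion of randomly generated tables whose test statistic is at least as large as the observed one. A minimal Python sketch of that estimator is given here; it is written in Python rather than FORTRAN, covers only the ordinary (unmodified) P-value, and its names (`mc_p_value`, `draw_statistic`, `n_sim`, `seed`) are hypothetical. The callable `draw_statistic` stands in for generating one set of random tables with the observed margins and computing the score statistic.

```python
import math
import random


def mc_p_value(t_obs, draw_statistic, n_sim=10000, seed=12345):
    """Monte Carlo estimate of P(T >= t_obs) with its standard error.

    draw_statistic(rng) must return the statistic of one simulated table.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_sim) if draw_statistic(rng) >= t_obs)
    p_hat = hits / n_sim
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n_sim)
    return p_hat, std_err
```

The binomial standard error indicates how many simulated tables are needed for a given precision; for example, 10000 tables give a standard error of at most 0.005.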


B.1  Program  Structure

Important  parameters  are  defined  as  follows.

NROW  : Integer  : input  : number  of  rows  in  the  observed  matrix
NCOL  : Integer  : input  : number  of  columns  in  the  observed  matrix
NSTM  : Integer  : input  : number  of  strata  in  the  observed  matrix

NROWT1  : Integer  array(50)  : output  : vector  of  row  totals  for  the  observed  matrix
at  each  stratum

NCOLT1  : Integer  array(50)  : output  : vector  of  column  totals  for  the  observed
matrix  at  each  stratum

NROWT  : Integer  array(20,50)  : output  : NROWT1  is  combined  for  all  the  strata
NCOLT  : Integer  array(20,50)  : output  : NCOLT1  is  combined  for  all  the  strata
NTOT  : Integer  array(20)  : output  : vector  of  stratum  totals  for  the  observed  table
JWORK  : Integer  array(50)  : output  : workspace

MATRIX1  : Integer  array(50,50)  : output  : the  randomly  generated  two-way  table
at  each  stratum

MX  : Integer  array(20,50,50)  : input  : the  observed  three-way  table
MATRIX  : Integer  array(20,50,50)  : output  : the  randomly  generated  three-way  table
NCODE  : Integer  : input  : select  the  type  of  tests  of  conditional  independence
NRCM  : Integer  : input  : (NROW-1)x(NCOL-1)

IDUM  : Negative  Integer  : input  : Seed

CMH  : double  precision  : output  : score  statistic

Important  subroutines  are  defined  as  follows. 

Subroutine  RCONT2

(NROW,NCOL,NSTM,NROWT1,NCOLT1,JWORK,MATRIX1,KEY,IFAULT,IDUM)

: Generate  two-way  random  tables  with  given  marginal  totals

Subroutine  COMPTOT(K,NROW,NCOL,MX,NROWT,NCOLT,NTOT)

: Compute  row,  column,  and  stratum  totals

Double  precision  Function  RAN1(IDUM)

: Uniform  random  number  generator,  which  is  used  in  Subroutine  RCONT2

Subroutine  GETWTS(NROW,NCOL,WTR,WTC,NCODE)

: Get  scores  if  ordinal  variable  is  used

Subroutine  CMHNN(NRCM,NROW,NCOL,NSTM,MATRIX,CMH)

: Compute  score  statistic  assuming  no  three-factor  interaction  when  both  X and  Y
are  nominal

Subroutine  CMHNO(NROW,NCOL,NSTM,MATRIX,CMH)

: Compute  score  statistic  assuming  no  three-factor  interaction  when  X is  nominal
and  Y is  ordinal

Subroutine  CMHOO(NROW,NCOL,NSTM,MATRIX,CMH)

: Compute  score  statistic  assuming  no  three-factor  interaction  when  both  X and  Y
are  ordinal

Subroutine  CMHNN1(NRCM,NROW,NCOL,NSTM,MATRIX,CMH)

: Compute  score  statistic  permitting  three-factor  interaction  when  both  X and  Y are
nominal

Subroutine  CMHNO1(NROW,NCOL,NSTM,MATRIX,CMH)

: Compute  score  statistic  permitting  three-factor  interaction  when  X is  nominal  and
Y is  ordinal

Subroutine  CMHOO1(NROW,NCOL,NSTM,MATRIX,CMH)

: Compute  score  statistic  permitting  three-factor  interaction  when  both  X and  Y are
ordinal


Other  subroutines  are  involved  to  compute  inverse  matrix,  matrix  multiplication, 
and  Kronecker  product  multiplication. 
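
The Kronecker product helper is not reproduced in this appendix. As an illustration only, a routine of that kind might look like the sketch below; the name KRON and its argument order are assumptions and need not match the DIRECTMM routine called in the listing.

```fortran
      SUBROUTINE KRON(A, MA, NA, B, MB, NB, C)
!     Hypothetical sketch: C = A (x) B, the Kronecker product.
!     A is MA x NA, B is MB x NB, C is (MA*MB) x (NA*NB).
      IMPLICIT NONE
      INTEGER MA, NA, MB, NB, I, J, K, L
      DOUBLE PRECISION A(MA,NA), B(MB,NB), C(MA*MB,NA*NB)
      DO I = 1, MA
        DO J = 1, NA
          DO K = 1, MB
            DO L = 1, NB
!             Block (I,J) of C holds A(I,J) times all of B
              C((I-1)*MB+K, (J-1)*NB+L) = A(I,J) * B(K,L)
            END DO
          END DO
        END DO
      END DO
      RETURN
      END
```

With A a column of row proportions and B a column of column proportions, the result stacks the products in the same (I-1)*NCOL+J order that the listing uses when it indexes cells of a stratum.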

B.2  Part  of  Source  Code 


PROGRAM  THREEWAY 
PARAMETER(lda=250)

PARAMETER(ldal=15)

PARAMETER(epsilon=1.0E-14)

IMPLICIT REAL*8 (A-H,O-Z)

REAL*8 D(50,50),D1(lda),VK(20,lda,lda),V(lda,lda)
REAL*8 DIV(lda),det(2)


DIMENSION NROWT(20,50),NCOLT(20,50),MATRIX(20,50,50)
DIMENSION JWORK(50),MX(20,50,50),NTOT(20),NNTOT(20)
DIMENSION NROWT1(50),NCOLT1(50),MATRIX1(50,50)

DIMENSION NROWT2(50),NCOLT2(50)

DIMENSION  NIK(50,20),NJK(50,20) 

C DIMENSION  NIJ(50,50) 

REAL*8  FACT(25001) ,WTR(50) ,WTC(50) 

LOGICAL  KEY 
LOGICAL  LSP,LSM 
LOGICAL  KIM 
C LOGICAL KEY1

COMMON /B/ NROWM,NCOLM,FACT
C COMMON /B1/ NIK,NJK

COMMON /A1/ NIK,NJK,NTOT

COMMON /A3/ D,D1,VK,V,DIV,det
COMMON  /A4/  WTR,WTC 

C 

COMMON  /TEMPRY/  HOP 
C 

DATA  MAXTOT  /25000/ 

C 

C 

C ********** Input Simulation Information **********

C 

WRITE(*, 10000) 

10000  F0RMAT(3(/) ,T12, '*****  LxL5  (version  8.0  — 4/16/94) 

1 /,T12,'SIX  EFFICIENT  SCORE  STATISTICS',/, 

2 T12,'FOR TESTING CONDITIONAL INDEPENDENCE',/,

3 T12,'OF THREE-WAY TABLES.',/)

WRITE(*,10001)

10001 FORMAT(T12,'THIS PROGRAM CALCULATES',/,

1 T12,'PRECISE ESTIMATES AND CONFIDENCE INTERVALS',/,

2 T12,'FOR THE MODIFIED EXACT P-VALUES.',/,

3 T12,'THEY UTILIZE BOTH SCORE STATISTICS.',/)

WRITE (*,45) 

45  FORMAT (/,/,/, 'ENTER  NUMBER  OF  STRATUMS:  ') 

READ(*,*)  NSTM 
WRITE(*,50) 


50  FORMAT (/, 'ENTER  NUMBER  OF  ROWS  AND  COLS:  ') 

READ(*,*)  NROW,NCOL 

52  WRITE(*,55) 

55 FORMAT(/,'ENTER CODE FOR TESTING: ',

1 /,/,' ASSUMING NO-THREE FACTOR INTERACTION :',

1 /,/,' 1 : NOMINAL BY NOMINAL',

2 /,' 2 : NOMINAL BY ORDINAL',

3 /,' 3 : ORDINAL BY ORDINAL',/,

4 /,/,' W/O ASSUMING NO-THREE FACTOR INTERACTION :',

1 /,/,' 4 : NOMINAL BY NOMINAL',

2 /,' 5 : NOMINAL BY ORDINAL',

3 /,' 6 : ORDINAL BY ORDINAL',/)

READ(*,*)  NCODE 

IF (NCODE.EQ.1 .OR. NCODE.EQ.2 .OR. NCODE.EQ.3 .OR.

1 NCODE.EQ.4 .OR. NCODE.EQ.5 .OR. NCODE.EQ.6) GO TO 57
PRINT*, 'PLEASE  ENTER  THE  NUMBER  (1  TO  6).' 

GO  TO  52 

57  IF  (NCODE  .EQ.  1 .OR.  NCODE  .EQ.  4)  GO  TO  60 
CALL  GETWTS (NROW , NCOL , WTR , WTC , NCODE) 

60  WRITE(*,75) 

75  FORMAT (/, 'ENTER  OBSERVED  TABLES  FOR  EACH  STRATUM  (ROW  BY  ROW)-') 

DO  5 K=1,NSTM 

PRINT* ,' STRATUM  NUMBER  =' ,K 
DO 10 I=1,NROW

READ(*,*) (MX(K,I,J),J=1,NCOL)

10  CONTINUE 

5 CONTINUE 

WRITE(*,80) 

80  FORMAT(/, 'ENTER  NUMBER  OF  SIMULATION:') 

READ(*,*)  NSIM 

WRITE(*,85) 

85  FORMAT (/, 'ENTER  THE  SEED  (INTEGER)  :') 

READ(*,*)  ISEED 

C 


DO 20 K=1,NSTM

CALL COMPTOT(K,NROW,NCOL,MX,NROWT,NCOLT,NTOT)


20 CONTINUE


DO 23 K=1,NSTM
23 NNTOT(K)=NTOT(K)

CALL SHELL(NSTM,NNTOT)

NNTOTAL=NNTOT (NSTM) 

PRINT* 

PRINT*, 'MAX  OF  NTOT  =',NNTOTAL 
PRINT* 

PRINT*, 'ROW  TOTAL' 

DO 25 K=1,NSTM

PRINT*,K,' : ',(NROWT(K,I),I=1,NROW)

25 CONTINUE

PRINT*

PRINT*,' COLUMN TOTAL'

DO 27 K=1,NSTM

PRINT*,K,' : ',(NCOLT(K,I),I=1,NCOL)

27 CONTINUE

PRINT* 

PRINT* , ' STRATUM  TOTAL ' 

PRINT* , (NTOT(K) ,K=1 ,NSTM) 

PRINT* 

PRINT* 

CALL MARTAB(NROW,NCOL,NSTM,NROWT,NCOLT,NIK,NJK)
NRCM=(NROW-1)*(NCOL-1)

CALL CPOBS(NROW,NCOL,NSTM,MX,NNTOTAL,P_OBS)
POBS=P_OBS

IF (NCODE.LE.3) THEN

IF (NCODE.EQ.1) THEN

CALL CMHNN(NRCM,NROW,NCOL,NSTM,MX,CMH)
CMHOBS=CMH

PRINT*,'CMH,CMHOBS=',CMH,CMHOBS
CALL CMHNN1(NROW,NCOL,NSTM,MX,CMH)
CMHOBS1=CMH

PRINT*,'CMH,CMHOBS1=',CMH,CMHOBS1

ELSE 


IF (NCODE.EQ.2) THEN
CALL CMHNO(NROW,NCOL,NSTM,MX,CMH)

CMHOBS=CMH

CALL CMHNN(NRCM,NROW,NCOL,NSTM,MX,CMH)
CMHOBS1=CMH

ELSE

CALL CMHOO(NROW,NCOL,NSTM,MX,CMH)

CMHOBS=CMH

CALL CMHNO(NROW,NCOL,NSTM,MX,CMH)

CMHOBS1=CMH
END  IF 
END  IF 

ELSE 

IF (NCODE.EQ.4) THEN

CALL CMHNN1(NROW,NCOL,NSTM,MX,CMH)

CMHOBS=CMH

C NO MORE GENERAL STATISTIC FOR T' ; WILL USE P({N})

ELSE

IF (NCODE.EQ.5) THEN

CALL CMHNO1(NROW,NCOL,NSTM,MX,CMH)

CMHOBS=CMH

CALL CMHNN1(NROW,NCOL,NSTM,MX,CMH)

CMHOBS1=CMH

ELSE

CALL CMHOO1(NROW,NCOL,NSTM,MX,CMH)

CMHOBS=CMH

CALL CMHNO1(NROW,NCOL,NSTM,MX,CMH)

CMHOBS1=CMH
END  IF 
ENDIF 
END  IF 

PRINT* 

PRINT*, 'THE  OBSERVED  PRIMARY  SCORE  STATISTIC  =',CMHOBS 
PRINT*,'THE OBSERVED SECONDARY SCORE STATISTIC =',CMHOBS1
PRINT*,'PROB OF OBSERVED TABLE = ',SNGL(POBS)

PRINT* 

WRITE (*,28) 

28 FORMAT(/,'PRINT EACH RANDOM TABLES ? (Y=1,N=0)')
READ(*,*)  NDATA 


C ********** begin simulation **********


S=0.D0
ITCOUNT=0
120 PRINT*

PRINT*,'RANDOM TABLES WITH FIXED MARGINS'

PRINT*

ITCOUNT=ITCOUNT+1

IDUM=(-1)*ISEED

DO 100 IJ=1,NSIM

SHOP=1.D0

IF  (NDATA  .NE.  1)  GO  TO  103 
PRINT*, 'SIMULATION  NO  = ',IJ 

103  DO  105  K=1,NSTM 
KK=K 

DO 90 L=1,NROW
NROWT1(L)=NROWT(KK,L)

90 CONTINUE

DO 92 L=1,NCOL
NCOLT1(L)=NCOLT(KK,L)

92 CONTINUE

CALL RCONT2(IJ,KK,NROW,NCOL,NSTM,NROWT1,NCOLT1,JWORK,
1 MATRIX1,KEY,IFAULT,NNTOTAL,ISEED,IDUM)

C PRINT*, 'PROB  OF  RANDOM  TABLE  =',HOP 

SHOP=SHOP*HOP 

DO 95 L=1,NROW
DO 96 M=1,NCOL
MATRIX(KK,L,M)=MATRIX1(L,M)

96 CONTINUE

95 CONTINUE

IF (NDATA .NE. 1) GO TO 97

DO 98 I=1,NROW

PRINT*,(MATRIX(KK,I,J),J=1,NCOL)

98  CONTINUE 

PRINT* 

97 IF (IFAULT .NE. 0) THEN
PRINT*,'IFAULT = ',IFAULT,'KEY = ',KEY
GO TO 1000
END IF
105 CONTINUE


C

C CALL COMPMAR(NROW,NCOL,NSTM,MATRIX,NIJ,KEY1)

C POBS : PROB OF OBSERVED TABLE, COMPUTED FROM CPOBS FOR TOTAL STRATUM
C HOP : PROB OF RANDOM TABLE, COMPUTED FROM RCONT2 FOR ONE STRATUM
C SHOP : PROB OF RANDOM TABLE, COMPUTED FROM RCONT2 FOR TOTAL STRATUM


IF (NCODE.LE.3) THEN
IF (NCODE.EQ.1) THEN

CALL CMHNN(NRCM,NROW,NCOL,NSTM,MATRIX,CMH)

CMHRAN=CMH

CALL CMHNN1(NROW,NCOL,NSTM,MATRIX,CMH)

CMHRAN1=CMH

IF (CMHRAN .GT. CMHOBS .OR.

CMHRAN .EQ. CMHOBS .AND. CMHRAN1 .GE. CMHOBS1)

1 CMHRAN .EQ. CMHOBS .AND. SNGL(SHOP) .LE. SNGL(POBS))

2 S=S+1


C IF (CMHRAN .GE. CMHOBS) S=S+1

ELSE 

IF (NCODE.EQ.2) THEN

CALL CMHNO(NROW,NCOL,NSTM,MATRIX,CMH)

CMHRAN=CMH

CALL CMHNN(NRCM,NROW,NCOL,NSTM,MATRIX,CMH)

CMHRAN1=CMH

IF (CMHRAN .GT. CMHOBS .OR.

CMHRAN .EQ. CMHOBS .AND. CMHRAN1 .GE. CMHOBS1)

CMHRAN .EQ. CMHOBS .AND. SNGL(SHOP) .LE. SNGL(POBS))
S=S+1


C IF (CMHRAN .GE. CMHOBS) S=S+1


ELSE 


CALL CMHOO(NROW,NCOL,NSTM,MATRIX,CMH)

CMHRAN=CMH

CALL CMHNO(NROW,NCOL,NSTM,MATRIX,CMH)

CMHRAN1=CMH

IF (CMHRAN .GT. CMHOBS .OR.

CMHRAN .EQ. CMHOBS .AND. CMHRAN1 .GE. CMHOBS1)

CMHRAN .EQ. CMHOBS .AND. SNGL(SHOP) .LE. SNGL(POBS))
2 S=S+1

C IF  (CMHRAN  .GE.  CMHOBS)  S=S+1 

ENDIF 
END  IF 

ELSE 

IF  (NCODE.EQ.4)  THEN 

CALL CMHNN1(NROW,NCOL,NSTM,MATRIX,CMH)

CMHRAN=CMH 

C NO  MORE  GENERAL  STATISTIC  FOR  T^  USE  P({Z}) 

IF  (CMHRAN  .GT.  CMHOBS  .OR. 

1 CMHRAN  .EQ.  CMHOBS  .AND.  SNGL(SHOP)  .LE.  SNGL(POBS)) 

2 S=S+1 

C IF  (CMHRAN  .GE.  CMHOBS)  S=S+1 

ELSE 

IF (NCODE.EQ.5) THEN

CALL CMHNO1(NROW,NCOL,NSTM,MATRIX,CMH)

CMHRAN=CMH

CALL CMHNN1(NROW,NCOL,NSTM,MATRIX,CMH)

CMHRAN1=CMH

IF (CMHRAN .GT. CMHOBS .OR.

CMHRAN .EQ. CMHOBS .AND. CMHRAN1 .GE. CMHOBS1)

CMHRAN .EQ. CMHOBS .AND. SNGL(SHOP) .LE. SNGL(POBS))
S=S+1

C IF (CMHRAN .GE. CMHOBS) S=S+1

ELSE 

CALL CMHOO1(NROW,NCOL,NSTM,MATRIX,CMH)

CMHRAN=CMH

CALL CMHNO1(NROW,NCOL,NSTM,MATRIX,CMH)

CMHRAN1=CMH

IF (CMHRAN .GT. CMHOBS .OR.

CMHRAN .EQ. CMHOBS .AND. CMHRAN1 .GE. CMHOBS1)

CMHRAN .EQ. CMHOBS .AND. SNGL(SHOP) .LE. SNGL(POBS))
S=S+1

C IF (CMHRAN .GE. CMHOBS) S=S+1

ENDIF
ENDIF


ENDIF
IF (NDATA .NE. 1) GO TO 100

PRINT*, 'THE  PRIMARY  SCORE  STATISTIC  FROM  RANDOM  TABLE  =',CMHRAN 
PRINT*, 'THE  SECONDARY  SCORE  STATISTIC  FROM  RANDOM  TABLE  =',CMHRAN1 
PRINT*, 'PROB  OF  RANDOM  TABLE  = ' , SNGL(SHOP) 

PRINT* , IJ , CMHRAN , CMHOBS , SNGL (S) , SNGL (SHOP) 

PRINT* 

100  CONTINUE 

ITNSIM=NSIM*ITCOUNT 

P_EXACT=S/ITNSIM 

VAR_P=P_EXACT*(1.D0-P_EXACT)/ITNSIM
CI1=P_EXACT-1.96D0*DSQRT(VAR_P)

CI2=P_EXACT+1.96D0*DSQRT(VAR_P)

STD_P=DSQRT(VAR_P) 

PRINT* 

PRINT*,'UPDATED ESTIMATE OF P_VALUE = ',SNGL(P_EXACT)

PRINT*,'UPDATED ESTIMATE OF VAR_P =',SNGL(VAR_P)

PRINT*,'UPDATED ESTIMATE OF STD_P = ',SNGL(STD_P)

PRINT*

PRINT*,'A 95% CONFIDENCE INTERVAL FOR UPDATED ESTIMATE OF P :'
PRINT*,' ',SNGL(CI1),SNGL(CI2)

WRITE(*,125)  NSIM,ISEED 

125 FORMAT(/,I8,' TABLES SAMPLED WITH CURRENT STARTING SEED ',I8)
WRITE(*,126) ITNSIM

126 FORMAT(I8,' TABLES SAMPLED TOTALLY')

C ************** simulation **************


WRITE(*,110) 

110  FORMAT(/,'DO  YOU  WANT  TO  SAMPLE  MORE  TABLES  ? (Y=1,N=0)  :') 
READ(*,*)  MORET 

IF (MORET .EQ. 1) THEN
WRITE(*,115) 

115  FORMAT (/, 'PLEASE  REENTER  THE  SEED  (INTEGER)  :') 

READ(*,*)  ISEED 
GO  TO  120 
ENDIF 
PRINT*
PRINT* , ' END ' 

1000  STOP 
END 


C**************** end of main program ****************
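
The counting rule inside the simulation loop can be sketched as a small logical function. As read from the listing, a sampled table counts toward the modified P-value when its primary statistic exceeds the observed one, or ties it while the secondary statistic is at least as large as observed. The extraction lost several comment and continuation markers, so which tie-breaking line was active in the original is not certain; the function name EXTREM is hypothetical and the sketch covers only the secondary-statistic variant.

```fortran
      LOGICAL FUNCTION EXTREM(T, TOBS, T1, T1OBS)
!     Hypothetical helper (not part of the original listing).
!     T, TOBS   : primary statistic, sampled and observed
!     T1, T1OBS : secondary statistic, sampled and observed
!     Ties in T are broken by the secondary statistic T1.
      IMPLICIT NONE
      DOUBLE PRECISION T, TOBS, T1, T1OBS
      EXTREM = T .GT. TOBS
      IF (T .EQ. TOBS .AND. T1 .GE. T1OBS) EXTREM = .TRUE.
      RETURN
      END
```

Dropping the tie-breaking line and testing T .GE. TOBS alone gives the ordinary (unmodified) rule that appears as a commented-out statement in the listing.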


C234567 

SUBROUTINE RCONT2(IJ,KK,NROW,NCOL,NSTM,NROWT1,NCOLT1,JWORK,

1 MATRIX1,KEY,IFAULT,NNTOTAL,ISEED,IDUM)

C 

C ALGORITHM AS 159 APPL. STATIST. (1981) VOL.30, NO.1

C GENERATE  RANDOM  TWO-WAY  TABLE  WITH  GIVEN  MARGINAL  TOTALS 

C 

C CODES  ARE  MODIFIED  BY  DONGUK  KIM  TO  BE  USED  FOR  THE  GENERATION 

C OF  THREE  WAY  TABLES,  AND  DEXP,DBLE,AND  DLOG  ARE  USED  INSTEAD  OF 

C EXP,  FLOAT,  AND  LOG. 

C NNTOTAL  IS  THE  MAXIMUM  OF  NTOTAL  FOR  THE  STRATUM 

C AND USED IN COMPUTING LOG-FACTORIALS.

IMPLICIT  REAL*8  (A-H,0-Z) 

DIMENSION NROWT1(50),NCOLT1(50),MATRIX1(50,50)

DIMENSION JWORK(50)

DIMENSION NROWT2(50),NCOLT2(50)

REAL*8  FACT(25001) 

LOGICAL  KEY 
LOGICAL  LSP,LSM 
COMMON  /B/  NROWM,NCOLM,FACT 
C 

COMMON  /TEMPRY/  HOP 
C 

DATA  MAXTOT  /25000/ 

C 

C IDUM=(KK+(IJ-1)*NSTM)*(-1) -ISEED 

C PRINT*, 'IDUM=' , IDUM 

C 

IFAULT=0 

DO 100 I=1,NROW
IF (NROWT1(I) .LE. 0) GOTO 214
100 CONTINUE
NTOTAL=0
DO 101 J=1,NCOL
IF (NCOLT1(J) .LE. 0) GOTO 215
NTOTAL=NTOTAL+NCOLT1(J)

101 CONTINUE

IF (NTOTAL .GT. MAXTOT) GOTO 216

IF  (KEY)  GOTO  103 
C 

C SET  KEY  FOR  SUBSEQUENT  CALLS 

C 

KEY=.TRUE. 

C 

C CHECK  FOR  FAULTS  AND  PREPARE  FOR  FUTURE  CALLS 

C 

IF (NROW .LE. 1) GOTO 212
IF (NCOL .LE. 1) GOTO 213
NROWM=NROW-1
NCOLM=NCOL-1

C

C CALCULATE LOG-FACTORIALS

C

X=0.D0

FACT(1)=0.D0

DO 102 I=1,NNTOTAL

X=X+DLOG(DBLE(I))

FACT(I+1)=X

102 CONTINUE

c print*, 'I  factorial' 

c do 90 i=1,20

c print*,i,dexp(fact(i)) 

c90  continue 

C 

C 

c CONSTRUCT  RANDOM  MATRIX 

C 

c 

103 DO 105 J=1,NCOLM
105 JWORK(J)=NCOLT1(J)

JC=NTOTAL

C


HOP=1.D0
C

DO 190 L=1,NROWM
NROWTL=NROWT1(L)

IA=NROWTL

IC=JC

JC=JC-NROWTL
DO 180 M=1,NCOLM
ID=JWORK(M) 

IE=IC 

IC=IC-ID 

IB=IE-IA 

II=IB-ID 

C 

C TEST  FOR  ZERO  ENTRIES  IN  MATRIX 

C 

IF  (IE  .NE.  0)  GOTO  130 
DO 121 J=M,NCOL
121 MATRIX1(L,J)=0
GOTO 190
C

C GENERATE PSEUDO-RANDOM NUMBER

C 

130  RAND=RAN1(IDUM) 

C 

C COMPUTE  CONDITIONAL  EXPECTED  VALUE  OF  MATRIX (L,M) 

C 

131  NLM=DBLE(IA*ID)/DBLE(IE)+0.5 
IAP=IA+1 

IDP=ID+1 

IGP=IDP-NLM 

IHP=IAP-NLM 

NLMP=NLM+1 

IIP=II+NLMP 

X=DEXP(FACT(IAP)+FACT(IB+1)+FACT(IC+1)+FACT(IDP)-
1 FACT(IE+1)-FACT(NLMP)-FACT(IGP)-FACT(IHP)-FACT(IIP))
IF (X .GE. RAND) GOTO 160
SUMPRB=X
Y=X

NLL=NLM
LSP=.FALSE.

LSM=.FALSE.

C 

C INCREMENT  ENTRY  IN  ROW  L,  COLUMN  M 

C 

140 J=(ID-NLM)*(IA-NLM)

IF (J .EQ. 0) GOTO 156
NLM=NLM+1

X=X*DBLE(J)/DBLE(NLM*(II+NLM))
SUMPRB=SUMPRB+X 
IF  (SUMPRB  .GE.  RAND)  GOTO  160 
150  IF  (LSM)  GOTO  155 
C 

C DECREMENT  ENTRY  IN  ROW  L,  COLUMN  M 

C 

J=NLL*(II+NLL) 

IF  (J  .EQ.  0)  GOTO  154 
NLL=NLL-1 

Y=Y*DBLE(J)/DBLE((ID-NLL)*(IA-NLL)) 

SUMPRB=SUMPRB+Y 

IF  (SUMPRB  .GE.  RAND)  GOTO  159 

IF  (.NOT.LSP)  GOTO  140 

GOTO  150 

154  LSM=.TRUE. 

155  IF  (.NOT.LSP)  GOTO  140 
RAND=SUMPRB*RAN1 (IDUM) 

GOTO  131 

156  LSP=.TRUE. 

GOTO  150 

159  NLM=NLL 
C 

HOP=HOP*Y 
GOTO  161 

160  HOP=HOP*X 

161 MATRIX1(L,M)=NLM
C

C160 MATRIX1(L,M)=NLM
IA=IA-NLM

JWORK(M)=JWORK(M)-NLM
180 CONTINUE

MATRIX1(L,NCOL)=IA
190  CONTINUE 
C 

C COMPUTE  ENTRIES  IN  LAST  ROW  OF  MATRIX 

C 

DO 192 M=1,NCOLM
192 MATRIX1(NROW,M)=JWORK(M)

MATRIX1(NROW,NCOL)=IB-MATRIX1(NROW,NCOLM)

C PRINT*,'HOP = ',HOP
C

C CHECK THE RANDOM TABLES SATISFY FIXED ROW TOTALS AND COLUMN TOTALS.

C

CALL COMPTOT1(NROW,NCOL,MATRIX1,NROWT2,NCOLT2)

DO 195 M=1,NROW

IF (NROWT2(M) .NE. NROWT1(M)) GO TO 200
195 CONTINUE

DO 197 M=1,NCOL

IF (NCOLT2(M) .NE. NCOLT1(M)) GO TO 202
197  CONTINUE 
RETURN 

C 

C SET  FAULTS 

C 

212  IFAULT=1 
RETURN 

213  IFAULT=2 
RETURN 

214  IFAULT=3 
RETURN 

215  IFAULT=4 
RETURN 

216  IFAULT=5 
RETURN 

200  PRINT*, M,'th  ROW  TOTAL  IS  WRONG.' 

RETURN 

202  PRINT*, M,'th  COLUMN  TOTAL  IS  WRONG.' 

RETURN 

END 
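
The quantity X that RCONT2 computes from the log-factorial table is a hypergeometric probability for one cell, given the remaining row total IA, column total ID, and grand total IE. The sketch below isolates that formula; the function name HYPERG is hypothetical, the local table is limited to totals of at most 1000, and the caller is assumed to pass a cell value N in the feasible range.

```fortran
      DOUBLE PRECISION FUNCTION HYPERG(N, IA, ID, IE)
!     Hypothetical sketch (not part of the original listing):
!     probability that a cell equals N given row total IA,
!     column total ID and grand total IE (IE <= 1000 assumed).
      IMPLICIT NONE
      INTEGER N, IA, ID, IE, IB, IC, II, I
      DOUBLE PRECISION FACT(0:1000), X
!     FACT(K) = LOG(K!), built by a running sum as in RCONT2
      FACT(0) = 0.D0
      DO I = 1, IE
        FACT(I) = FACT(I-1) + DLOG(DBLE(I))
      END DO
      IB = IE - IA
      IC = IE - ID
      II = IB - ID
      X = FACT(IA) + FACT(IB) + FACT(IC) + FACT(ID)
      X = X - FACT(IE) - FACT(N) - FACT(ID-N)
      X = X - FACT(IA-N) - FACT(II+N)
      HYPERG = DEXP(X)
      RETURN
      END
```

Working on the log scale keeps the factorials from overflowing; RCONT2 builds its table once, on the first call, rather than on every evaluation as this sketch does.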


* Uniform  random  generator 


DOUBLE PRECISION FUNCTION RAN1(IDUM)

IMPLICIT REAL*8 (A-H,O-Z)

REAL*8 R(97)

PARAMETER (M1=259200,IA1=7141,IC1=54773,RM1=3.8580247E-6)
PARAMETER (M2=134456,IA2=8121,IC2=28411,RM2=7.4373773E-6)


PARAMETER (M3=243000,IA3=4561,IC3=51349)
DATA IFF /0/

IF (IDUM.LT.0.OR.IFF.EQ.0) THEN
IFF=1

IX1=MOD(IC1-IDUM,M1)
IX1=MOD(IA1*IX1+IC1,M1)

IX2=MOD(IX1,M2)

IX1=MOD(IA1*IX1+IC1,M1)

IX3=MOD(IX1,M3)

DO 11 J=1,97

IX1=MOD(IA1*IX1+IC1,M1)
IX2=MOD(IA2*IX2+IC2,M2)
R(J)=(DBLE(IX1)+DBLE(IX2)*RM2)*RM1
11 CONTINUE

IDUM=1
END IF

IX1=MOD(IA1*IX1+IC1,M1)
IX2=MOD(IA2*IX2+IC2,M2)
IX3=MOD(IA3*IX3+IC3,M3)

J=1+(97*IX3)/M3
IF(J.GT.97.OR.J.LT.1)PAUSE
RAN1=R(J)

R(J)=(DBLE(IX1)+DBLE(IX2)*RM2)*RM1

RETURN 

END 


C234567 

SUBROUTINE GETWTS(NROW,NCOL,WTR,WTC,NCODE)
IMPLICIT REAL*8 (A-H,O-Z)

INTEGER NROW,NCOL
REAL*8 WTR(50),WTC(50)

IF (NCODE .EQ. 2 .OR. NCODE .EQ. 5) GO TO 105
WRITE(*,100)

100 FORMAT(/,'ENTER ROW SCORES: ')

READ(*,*) (WTR(I),I=1,NROW)

105 WRITE(*,110)

110 FORMAT(/,'ENTER COLUMN SCORES: ')

READ(*,*) (WTC(J),J=1,NCOL)

RETURN 

END 
C SCORE STATISTICS 1


C234567 

SUBROUTINE  CMHNN (NRCM , NROW , NCOL , NSTM , MATRIX , CMH) 

C TO  COMPUTE  SCORE  STATISTIC 

C FOR  THE  TEST  OF  THE  CONDITIONAL  INDEPENDENCE  OF  THE  I*J*K  TABLES 

C WHEN  X IS  NOMINAL  AND  Y IS  NOMINAL. 

C COMMON  IS  USED  FOR  NIK,NJK,NTOT 

IMPLICIT REAL*8 (A-H,O-Z)

PARAMETER(lda=250)

DIMENSION MATRIX(20,50,50),NIK(50,20),NJK(50,20),NTOT(20)

REAL*8 D(50,50),D1(lda),VK(20,lda,lda),V(lda,lda)

REAL*8 DIV(lda),det(2)

c REAL*8 D(50,50),D1(NRCM),VK(20,NRCM,NRCM),V(NRCM,NRCM)

c REAL*8 DIV(NRCM),det(2)

c real*8 VINV(lda,lda)

integer lda,NRCM,info,job
LOGICAL KIM

COMMON /A1/ NIK,NJK,NTOT
COMMON /A3/ D,D1,VK,V,DIV,det

NROWM=NROW-1

NCOLM=NCOL-1

C COMPUTE D((NROWM*NCOLM),1) VECTOR

DO 100 I=1,NROWM
DO 105 J=1,NCOLM
D(I,J)=0.D0
DO 110 K=1,NSTM

D(I,J)=D(I,J)+(MATRIX(K,I,J)-(NIK(I,K)*NJK(J,K))/DBLE(NTOT(K)))

110 continue
105 CONTINUE
100 CONTINUE

DO 115 I=1,NROWM
DO 120 J=1,NCOLM
K=(I-1)*NCOLM+J
120 D1(K)=D(I,J)

115 CONTINUE

115  CONTINUE 


IF (KIM) GO TO 15
c
c SET KIM FOR SUBSEQUENT CALLS
c
KIM=.TRUE.

c COMPUTE V((NROWM*NCOLM),(NROWM*NCOLM)) MATRIX

122 DO 125 K=1,NSTM
L=0

DO 130 I=1,NROWM
DO 140 J=1,NCOLM
L=L+1
M=0

DO 150 IP=1,NROWM
DO 160 JP=1,NCOLM
IND1=0
IND2=0

IF (I.EQ.IP) IND1=1
IF (J.EQ.JP) IND2=1
M=M+1

VK(K,L,M)=NIK(I,K)*(IND1*NTOT(K)-NIK(IP,K))

1 *NJK(J,K)*(IND2*NTOT(K)-NJK(JP,K))/

2 DBLE(NTOT(K)*NTOT(K)*(NTOT(K)-1.D0))

160 CONTINUE
150 CONTINUE
140 CONTINUE
130 CONTINUE
125 CONTINUE

DO 170 I=1,NRCM

DO 180 J=1,NRCM

V(I,J)=0.D0

DO 190 K=1,NSTM

190 V(I,J)=V(I,J)+VK(K,I,J)

180 CONTINUE

170 CONTINUE

WRITE(*,195)

195 FORMAT(/,'PRINT NULL COVARIANCE MATRIX ? (Y=1,N=0)')

READ(*,*)  NCOV 

IF  (NCOV  .NE.  1)  GO  TO  196 


print* 
PRINT*,'NULL COVARIANCE MATRIX'

DO 191 I=1,NRCM

PRINT*,(SNGL(V(I,J)),J=1,NRCM)

191 CONTINUE

196 JOB=01

n=NRCM

c lda=NRCM

CALL dpofa(V,lda,n,info)

IF  (INFO  .NE.  0)  THEN 
WRITE(*,99)  INFO 

99 FORMAT(/,'THE FACTORIZATION IS NOT COMPLETE.',/,

1 'THE LEADING MINOR OF ORDER',I5,' IS NOT POSITIVE DEFINITE.')
PRINT*

END IF

CALL dpodi(V,lda,n,det,job)

C COMPUTE DETERMINANT AND INVERSE MATRIX OF

c A CERTAIN  REAL  SYMMETRIC  POSITIVE  DEFINITE  MATRIX. 

C ONLY  FOR  SYMMETRIC  MATRIX  ! ! 

c DPODI  PRODUCES  THE  UPPER  HALF  OF  INVERSE  OF  V. 

C RETURNED  V IS  THE  VAR-COV  MATRIX  OF  V. 

DO 5 I=2,n
DO 6 J=1,I-1

6 V(I,J)=V(J,I)

5 CONTINUE 

7 WRITE(*,198) 

198  FORMAT (/, 'PRINT  INVERSE  MATRIX  OF  NULL  COV.  MATRIX  ? (Y=1,N=0)') 

READ(*,*)  NINVC 
IF  (NINVC  .NE.  1)  GO  TO  15 

PRINT* 

PRINT* , ' INVERSE  MATRIX  : ' 
do 10 i=1,n

10 print*,(SNGL(V(i,j)),j=1,n)


15 CALL MULTVA(D1,V,NRCM,NRCM,DIV)
CALL INNER(DIV,D1,NRCM,CMHV)
CMH=CMHV

C PRINT*,'SCORE STATISTIC =',CMH
RETURN

END 

C SCORE  STATISTIC  2 


C234567 

SUBROUTINE  CMHNO (NROW , NCOL , NSTM , MATRIX , CMH) 

C TO  COMPUTE  SCORE  STATISTIC 

C FOR  THE  TEST  OF  THE  CONDITIONAL  INDEPENDENCE  OF  THE  I*J*K  TABLES 

C WHEN  X IS  NOMINAL  AND  Y IS  ORDINAL. 

C COMMON  IS  USED  FOR  NIK,NJK,NTOT 

C COMMON IS USED FOR WTR,WTC

IMPLICIT REAL*8 (A-H,O-Z)

PARAMETER(lda=250)

PARAMETER(ldal=15)

DIMENSION MATRIX(20,50,50),NIK(50,20),NJK(50,20),NTOT(20)

REAL*8 WTR(50),WTC(50)

C global arrays

REAL*8 PIK(50,20),PJK(50,20)

REAL*8 NK(20,lda,1),MK(20,lda,1),VK(20,lda,lda)

REAL*8 GK(20,15,15),VGK(20,15,15),G(15,15),VG(15,15),GT(15,15)
REAL*8 BK(lda,lda),BKT(lda,lda),CK(15,15)

C local arrays

REAL*8 A(15,15),A1(15,15),C1(15,15),C2(15,15),C3(15,15)

REAL*8 D(15,15),B(15),Y(15)

REAL*8 C(lda,lda),GNMK(lda,lda),YK(lda,lda),V(lda,lda)

REAL*8 GTVG(15,15)


integer lda,ldal,NROWM,NNN,NRNC,info,job
LOGICAL KIM

COMMON /A1/ NIK,NJK,NTOT
COMMON /A4/ WTR,WTC


NNN=1 

NROWM=NROW-1

NRNC=NROW*NCOL

IF (KIM) GO TO 1000
C

C SET KIM FOR SUBSEQUENT CALLS
C

KIM=.TRUE.

C COMPUTE NULL VAR-COV MATRIX VG(NROWM,NROWM)

DO 100 K=1,NSTM
DO 110 I=1,NROW

110 PIK(I,K)=DBLE(NIK(I,K))/DBLE(NTOT(K))

DO 120 J=1,NCOL

120 PJK(J,K)=DBLE(NJK(J,K))/DBLE(NTOT(K))

100  CONTINUE 



C COMPUTE Mk=E(Nk|H0), WHICH IS SAVED IN MK(K,NRNC,1)
C

DO 200 K=1,NSTM

DO 230 I=1,NROW
230 A(I,1)=PIK(I,K)

DO 240 J=1,NCOL
240 A1(J,1)=PJK(J,K)

CALL DIRECTMM(A,A1,C,NROW,NNN,NCOL,NNN)

C ********************************************

DO 250 I=1,NRNC
250 MK(K,I,1)=NTOT(K)*C(I,1)

DO 255 I=1,15
DO 256 J=1,15
A(I,J)=0.D0
A1(I,J)=0.D0

256 CONTINUE
255 CONTINUE

DO 257 I=1,lda
DO 257 J=1,NNN

257 C(I,J)=0.D0

200 CONTINUE
C

C COMPUTE Var(Nk|H0), WHICH IS SAVED IN VK(K,I,J)

C 

DO 350 K=1,NSTM

DO 260 J=1,NCOL
B(J)=PJK(J,K)

A(J,1)=PJK(J,K)

A1(1,J)=PJK(J,K)

260 CONTINUE

CALL MMULTM(A,A1,NCOL,NNN,NCOL,C1)

CALL DIAG(B,NCOL,D)

CALL MSUBTM(D,C1,NCOL,NCOL,C2)

DO 265 I=1,15
265 B(I)=0.D0

DO 267 I=1,15
DO 268 J=1,15
A(I,J)=0.D0
A1(I,J)=0.D0
C1(I,J)=0.D0
D(I,J)=0.D0
268 CONTINUE
267 CONTINUE

DO 270 I=1,NROW
B(I)=PIK(I,K)

A(I,1)=PIK(I,K)

A1(1,I)=PIK(I,K)

270 CONTINUE

CALL MMULTM(A,A1,NROW,NNN,NROW,C1)

CALL DIAG(B,NROW,D)

CALL MSUBTM(D,C1,NROW,NROW,C3)

C ********************************************

CALL DIRECTMM(C3,C2,C,NROW,NROW,NCOL,NCOL)

C ********************************************

DO 280 I=1,NRNC
DO 290 J=1,NRNC

290 VK(K,I,J)=DBLE(NTOT(K)*NTOT(K))/DBLE(NTOT(K)-1)*C(I,J)

280 CONTINUE
DO 300 I=1,15
DO 310 J=1,15
A(I,J)=0.D0
A1(I,J)=0.D0
C1(I,J)=0.D0
C2(I,J)=0.D0
C3(I,J)=0.D0
D(I,J)=0.D0
310 CONTINUE
300 CONTINUE

DO 320 I=1,lda
DO 330 J=1,lda
330 C(I,J)=0.D0
320 CONTINUE

DO 340 I=1,15
340 B(I)=0.D0

350 CONTINUE

C 





C COMPUTE SCORE MATRIX BK(NROWM,NRNC)=CK(1,NCOL)@RK(NROWM,NROW)

DO 400 I=1,NROWM
400 Y(I)=1.D0

YY=-1.D0

CALL AUGMD(Y,NROWM,YY,D)

C D(NROWM,NROW)=RK

C CK(1,NCOL) IS COLUMN SCORES.

DO 410 J=1,NCOL
410 CK(1,J)=WTC(J)

C ********************************************

CALL DIRECTMM(D,CK,BK,NROWM,NROW,NNN,NCOL)

CALL TRANS(BK,NROWM,NRNC,BKT)

C BKT(NRNC,NROWM) IS TRANSPOSE OF BK(NROWM,NRNC).

DO 446 I=1,15
446 Y(I)=0.D0
DO 447 I=1,15
DO 447 J=1,15
447 D(I,J)=0.D0



C COMPUTE VG(NROWM,NROWM). THIS IS SUMMING VGK(K,NROWM,NROWM),
C WHICH IS Bk(VAR(Nk|H0))Bk', OVER K STRATUM.

C

DO 450 K=1,NSTM

DO 460 I=1,NRNC
DO 470 J=1,NRNC
470 GNMK(I,J)=VK(K,I,J)

460 CONTINUE

CALL MMULTM1(BK,GNMK,NROWM,NRNC,NRNC,YK)

CALL MMULTM1(YK,BKT,NROWM,NRNC,NROWM,V)

DO 475 I=1,NROWM
DO 480 J=1,NROWM
480 VGK(K,I,J)=V(I,J)

475 CONTINUE

DO 485 I=1,lda
DO 490 J=1,lda
GNMK(I,J)=0.D0
YK(I,J)=0.D0
V(I,J)=0.D0
490 CONTINUE
485 CONTINUE

450 CONTINUE

DO 530 I=1,NROWM
DO 540 J=1,NROWM
VG(I,J)=0.D0
DO 550 K=1,NSTM

550 VG(I,J)=VG(I,J)+VGK(K,I,J)

540  CONTINUE 
530  CONTINUE 

C VG (NROWM, NROWM)  IS  VAR-COV  MATRIX. 

WRITE(*,600)

600 FORMAT(/,'PRINT NULL COVARIANCE MATRIX ? (Y=1,N=0)')
READ(*,*) NCOV

IF (NCOV .NE. 1) GO TO 620


print*

PRINT*,'NULL COVARIANCE MATRIX'

DO 610 I=1,NROWM

PRINT*,(SNGL(VG(I,J)),J=1,NROWM)

610 CONTINUE

620 JOB=01

n=NROWM

c lda=NRCM

CALL dpofa(VG,ldal,n,info)

IF (INFO .NE. 0) THEN
WRITE(*,699) INFO

699 FORMAT(/,'THE FACTORIZATION IS NOT COMPLETE.',/,

1 'THE LEADING MINOR OF ORDER',I5,' IS NOT POSITIVE DEFINITE.')
PRINT*

ENDIF

CALL dpodi(VG,ldal,n,det,job)

DO 605 I=2,n
DO 606 J=1,I-1
606 VG(I,J)=VG(J,I)

605 CONTINUE

7 WRITE(*,698)

698 FORMAT(/,'PRINT INVERSE MATRIX OF NULL COV. MATRIX ? (Y=1,N=0)')
READ(*,*) NINVC
IF (NINVC .NE. 1) GO TO 1000

PRINT*

PRINT*,' INVERSE MATRIX : '
do 690 i=1,n

690 print*,(SNGL(VG(i,j)),j=1,n)

C COMPUTE G(NROWM,1). THIS IS SUMMING GK(K,NROWM,1),

C WHICH IS Bk(Nk-Mk), OVER K STRATUM.

C G(NROWM,1) DEPENDS ON DATA Nk.


1000 DO 1005 K=1,NSTM
DO 1010 I=1,NROW

DO 1020 J=1,NCOL

IJ=(I-1)*NCOL+J

NK(K,IJ,1)=MATRIX(K,I,J)

1020  CONTINUE 
1010  CONTINUE 

DO  1030  I=1,NRNC 
GNMK(I,1)=NK(K,I,1)-MK(K,I,1) 

C NK(K,I,1)  IS  DEFINED  AS  REAL*8 
1030  CONTINUE 

C ARRAYS ARE (lda,lda). MMULTM1 IS CALLED INSTEAD OF MMULTM.
CALL MMULTM1(BK,GNMK,NROWM,NRNC,NNN,YK)

DO 1040 I=1,NROWM
GK(K,I,1)=YK(I,1)

1040  CONTINUE 

DO 1050 I=1,lda
GNMK(I,1)=0.D0
YK(I,1)=0.D0
1050 CONTINUE

1005 CONTINUE

DO 1060 I=1,NROWM
DO 1070 J=1,NNN
G(I,J)=0.D0
DO 1080 K=1,NSTM
1080 G(I,J)=G(I,J)+GK(K,I,J)

1070  CONTINUE 
1060  CONTINUE 

C 

C COMPUTE  SCORE  STATISTIC 
C CMH=G'(VG^-1)G
C

C COMPUTE TRANSPOSE OF G(NROWM,1)
DO 1100 I=1,NROWM
1100 GT(1,I)=G(I,1)

CALL MMULTM(GT,VG,NNN,NROWM,NROWM,GTVG)

DO 1200 I=1,15
B(I)=0.D0
Y(I)=0.D0
1200 CONTINUE

DO 1210 I=1,NROWM
B(I)=GTVG(1,I)

Y(I)=G(I,1)

1210 CONTINUE

CALL INNER1(B,Y,NROWM,CMHV)

CMH=CMHV

C PRINT*,'C-M-H STATISTIC =',CMH

DO 1212 I=1,15
B(I)=0.D0
Y(I)=0.D0
1212 CONTINUE

DO 1215 I=1,15
DO 1215 J=1,15
GTVG(I,J)=0.D0
1215 D(I,J)=0.D0

RETURN 

END 

C SCORE  STATISTIC  3 


C234567 


SUBROUTINE CMHOO(NROW,NCOL,NSTM,MATRIX,CMH)

C TO COMPUTE SCORE STATISTIC
C FOR THE TEST OF THE CONDITIONAL INDEPENDENCE OF THE I*J*K TABLES
C WHEN X IS ORDINAL AND Y IS ORDINAL.
C COMMON IS USED FOR NIK,NJK,NTOT
C COMMON IS USED FOR WTR,WTC

IMPLICIT REAL*8 (A-H,O-Z)

DIMENSION MATRIX(20,50,50),NIK(50,20),NJK(50,20),NTOT(20)
REAL*8 WTR(50),WTC(50)
COMMON /A1/ NIK,NJK,NTOT
COMMON /A4/ WTR,WTC


T=0.D0

TT=0.D0

DO 100 K=1,NSTM
DO 110 I=1,NROW
DO 120 J=1,NCOL

T=T+WTR(I)*WTC(J)*(MATRIX(K,I,J)-(NIK(I,K)*NJK(J,K))

1 /DBLE(NTOT(K)))

c T1=WTR(I)*WTC(J)*(MATRIX(K,I,J)-(NIK(I,K)*NJK(J,K))/DBLE(NTOT(K)))

c LINEAR RANK STATISTICS

TT=TT+WTR(I)*WTC(J)*MATRIX(K,I,J)
c TT1=WTR(I)*WTC(J)*MATRIX(K,I,J)

c PRINT*,'T1=',T1,'TT1=',TT1

120 CONTINUE
110 CONTINUE
100 CONTINUE

C CMH=T 

CMH=TT 

c PRINT*, 'T=' ,T, 'TT=' ,TT 

RETURN 

END 

C SCORE  STATISTIC  4 

C234567 

SUBROUTINE CMHNN1(NROW,NCOL,NSTM,MATRIX,CMH)

C TO  COMPUTE  SCORE  STATISTIC 

C FOR  THE  TEST  OF  THE  CONDITIONAL  INDEPENDENCE  OF  THE  I*J*K  TABLES 

C WHEN  X IS  NOMINAL  AND  Y IS  NOMINAL. 

C W/0  ASSUMING  NO-THREE  FACTOR  INTERACTION  MODEL. 

C MAX  NO.  OF  STRATUM:  10 

C MAX  NO.  OF  ROW*COL  : 250 

C COMMON  IS  USED  FOR  NIK,NJK,NTOT 
IMPLICIT REAL*8 (A-H,O-Z)

DIMENSION MATRIX(20,50,50),NIK(50,20),NJK(50,20),NTOT(20)
COMMON /A1/ NIK,NJK,NTOT


X=0.D0

DO 100 K=1,NSTM
DO 110 I=1,NROW
DO 110 J=1,NCOL

EV=(NIK(I,K)*NJK(J,K))/DBLE(NTOT(K))
X=X+((DBLE(MATRIX(K,I,J))-EV)**2)/EV
c print*, nik(i,k) ,njk(j ,k) ,ntot(k) 

c print*, matrix(k, i, j) ,ev,x 

110  CONTINUE 
100  CONTINUE 
CMH=x 

c print*, 'SCORE  STATISTIC  = ' ,CMH 

RETURN 

END 
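
CMHNN1 sums a Pearson-type term over every cell of every stratum, taking the margins from the COMMON block. For a single stratum the same computation can be sketched in self-contained form as below; the function name PEARS and its interface are hypothetical, and every row and column total is assumed to be positive so that no expected count is zero.

```fortran
      DOUBLE PRECISION FUNCTION PEARS(NT, NR, NC)
!     Hypothetical sketch (not part of the original listing):
!     Pearson statistic for one NR x NC stratum, with expected
!     counts formed from that stratum's own margins.
      IMPLICIT NONE
      INTEGER NR, NC
      INTEGER NT(NR,NC), NROWS(NR), NCOLS(NC)
      INTEGER I, J, NTOT
      DOUBLE PRECISION EV
      NTOT = 0
      DO I = 1, NR
        NROWS(I) = 0
      END DO
      DO J = 1, NC
        NCOLS(J) = 0
      END DO
!     Accumulate row, column and grand totals
      DO I = 1, NR
        DO J = 1, NC
          NROWS(I) = NROWS(I) + NT(I,J)
          NCOLS(J) = NCOLS(J) + NT(I,J)
          NTOT = NTOT + NT(I,J)
        END DO
      END DO
      PEARS = 0.D0
      DO I = 1, NR
        DO J = 1, NC
          EV = DBLE(NROWS(I)) * DBLE(NCOLS(J)) / DBLE(NTOT)
          PEARS = PEARS + (DBLE(NT(I,J)) - EV)**2 / EV
        END DO
      END DO
      RETURN
      END
```

Adding the values returned for each stratum gives the stratified sum that the listing accumulates in X.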


C SCORE  STATISTIC  5 

C234567 

SUBROUTINE CMHNO1(NROW,NCOL,NSTM,MATRIX,CMH)

C TO  COMPUTE  SCORE  STATISTIC 

C FOR  THE  TEST  OF  THE  CONDITIONAL  INDEPENDENCE  OF  THE  I*J*K  TABLES 

C WHEN  X IS  NOMINAL  AND  Y IS  ORDINAL. 

C W/0  ASSUMING  NO-THREE  FACTOR  INTERACTION  MODEL. 

C MAX  NO.  OF  STRATUM:  10 

C MAX  NO.  OF  ROW*COL  : 250 

C COMMON  IS  USED  FOR  NIK,NJK,NTOT 

C COMMON  IS  USED  FOR  WTR,WTC 

IMPLICIT REAL*8 (A-H,O-Z)

DIMENSION MATRIX(20,50,50),NIK(50,20),NJK(50,20),NTOT(20)

REAL*8 WTR(50),WTC(50)

REAL*8 UV(100),VK(10),GK(100,250),D(100,2500),DT(2500,100)

REAL*8 P(2500),DP(2500,2500),PP(2500,2500),SIGMA(2500,2500)

REAL*8 DSIGMA(100,2500),COVG(100,100),DIV(100)
LOGICAL KIM

COMMON /A1/ NIK,NJK,NTOT
COMMON /A4/ WTR,WTC
COMMON /A5/ P,DP,PP,SIGMA


NNN=1 

NRNC=NROW*NCOL 

KNRNC=NSTM*NRNC 

NROWM=NROW-1

NRNK=(NROW-1)*NSTM

NTOTAL=0
DO 100 K=1,NSTM
100 NTOTAL=NTOTAL+NTOT(K)
c print*,'ntotal=',ntotal

L=0

DO 200 K=1,NSTM
DO 210 I=1,NROWM
L=L+1

UV(L)=0.D0
DO 220 J=1,NCOL

UV(L)=UV(L)+WTC(J)*(MATRIX(K,I,J)-

1 dble(nik(i,k)*njk(j,k))/dble(ntot(k)))

220 CONTINUE

UV(L)=UV(L)/DBLE(NTOTAL)

210  CONTINUE 
200  CONTINUE 

IF  (KIM)  GO  TO  900 
C 

C SET  KIM  FOR  SUBSEQUENT  CALLS 

C 

KIM=.TRUE. 


C NULL  ASYMPTOTIC  COVARIANCE  OF  SCORES. 
C COMPUTE  GK(NRNK,NRNC) 

DO 250 K=1,NSTM
VK(K)=0.D0
DO 270 J=1,NCOL

270 VK(K)=VK(K)+WTC(J)*NJK(J,K)

250 CONTINUE
L=0

DO 280 K=1,NSTM
DO 290 I=1,NROWM
L=L+1
M=0

DO 300 IP=1,NROW
IND1=0

IF (I .EQ. IP) IND1=1
DO 310 JP=1,NCOL
M=M+1

GK(L,M)=(WTC(JP)*NTOT(K)-VK(K))*

1 (NTOT(K)*IND1-NIK(I,K))/DBLE(NTOT(K)*NTOT(K))

310  CONTINUE 
300  CONTINUE 
290  CONTINUE 
280  CONTINUE 

C COMPUTE  D(NRNK,KNRNC) 

DO  320  I=1,NRNK 
DO  330  J=1,KNRNC 
330  D(I,J)=0.d0 

320  CONTINUE 

L=0 

DO  350  K=1,NSTM 
DO  360  I=1,NR0WM 
L=L+1 

DO  370  IJ=1,NRNC 

M=(K-1)*NRNC+IJ 

D(L,M)=GK(L,IJ) 

370  CONTINUE 
360  CONTINUE 
350  CONTINUE 

C COMPUTE SIGMA(KNRNC,KNRNC)=DIAG(P)-PP'

L=0

DO 400 K=1,NSTM
DO 410 I=1,NROW
DO 420 J=1,NCOL
L=L+1

420 P(L)=DBLE(NIK(I,K)*NJK(J,K))/DBLE((NTOT(K)*NTOTAL))

410 CONTINUE
400 CONTINUE
C P(KNRNC), DP(KNRNC,KNRNC)

CALL  DIAG1(P,KNRNC,DP) 

CALL  CMULR(P,KNRNC,NNN,KNRNC,PP) 

DO  500  I=1,KNRNC 
DO  510  J=1,KNRNC 
510  SIGMA(I,J)=DP(I,J)-PP(I,J) 

500  CONTINUE 

C COMPUTE  COV(G(P))=D  SIGMA  D'/NTOTAL 

C TRANSPOSE  OF  D(NRNK,KNRNC)  : DT(KNRNC,NRNK) 

DO  550  I=1,NRNK 
DO  560  J=1,KNRNC 
560  DT(J,I)=D(I,J) 

550  CONTINUE 

C COMPUTE  D(NRNK,KNRNC)*SIGMA(KNRNC,KNRNC)=DSIGMA(NRNK,KNRNC) 

C print*, 'dsigma(NRNK,knrnc) ' 

DO  600  I=1,NRNK 

DO  610  J=1,KNRNC 

DSIGMA(I,J)=0.D0
DO  620  K=1,KNRNC

DSIGMA(I,J)=DSIGMA(I,J)+D(I,K)*SIGMA(K,J)

620  CONTINUE 

if  (dabs(dsigma(i,j))  .lt.  1.0d-15)  dsigma(i,j)=0.d0

610  CONTINUE 

600  CONTINUE 

c do  622  i=1,NRNK

c622  print*, (sngl(dsigma(i,j)),j=1,knrnc)

C COMPUTE  DSIGMA(NRNK,KNRNC)*DT(KNRNC,NRNK)=COVG(NRNK,NRNK) 
c print*, 'covg(NRNK,NRNK) ' 

DO  650  I=1,NRNK 

DO  660  J=1,NRNK 

COVG(I,J)=0.D0
DO  670  K=1,KNRNC

670  COVG(I,J)=COVG(I,J)+DSIGMA(I,K)*DT(K,J)

660  CONTINUE 

c print*, (sngl(covg(i,j)),j=1,NRNK)

650  CONTINUE 

C COMPUTE  ESTIMATE  COV  G(P)

DO  700  I=1,NRNK 
DO  710  J=1,NRNK

710  COVG(I,J)=COVG(I,J)/DBLE(NTOTAL)

700  CONTINUE 

WRITE(*,720) 

720  FORMAT (/, 'PRINT  NULL  COVARIANCE  MATRIX  ? (Y=1,N=0)') 

READ(*,*)  NCOV 

IF  (NCOV  .NE.  1)  GO  TO  760 

print* 

PRINT*, 'NULL  COVARIANCE  MATRIX  :' 

DO  750  I=1,NRNK 

PRINT*, (SNGL(COVG(I,J)),J=1,NRNK)

750  CONTINUE 

760  JOB=01

n=NRNK 
lda=100 

CALL  dpofa(COVG,lda,n,info)

IF  (INFO  .NE.  0)  THEN 
WRITE(*,699)  INFO 

699  FORMAT(/, 'THE  FACTORIZATION  IS  NOT  COMPLETE.',/, 

1 'THE  LEADING  MINOR  OF  ORDER',I5,'  IS  NOT  POSITIVE  DEFINITE.')
PRINT* 

END  IF 

CALL  dpodi(COVG,lda,n,det,job)

DO  800  I=2,n
DO  810  J=1,I-1
810  COVG(I,J)=COVG(J,I)

800  CONTINUE 

WRITE(*,850) 

850  FORMAT (/, 'PRINT  INVERSE  MATRIX  OF  NULL  COV.  MATRIX  ? (Y=1,N=0)')
READ(*,*)  NINVC
IF  (NINVC  .NE.  1)  GO  TO  900 

PRINT* 

PRINT* , ' INVERSE  MATRIX  : ' 
do  860  i=1,n

860  print*, (SNGL(COVG(i,j)),j=1,n)


C COMPUTE  SCORE  STATISTIC  : UV'  COVG^-1  UV

900  CALL  MULTVA2(UV,COVG,NRNK,NRNK,DIV)

CALL  INNER2(DIV,UV,NRNK,CMHV) 

CMH=CMHV 

C PRINT*, 'SCORE  STATISTIC  FOR  RANDOM  TABLE  =' ,CMH 

RETURN 

END 


C SCORE  STATISTIC  6 

C234567 

SUBROUTINE  CMHOO1(NROW,NCOL,NSTM,MATRIX,CMH)

C TO  COMPUTE  SCORE  TEST  STATISTIC

C FOR  THE  TEST  OF  THE  CONDITIONAL  INDEPENDENCE  OF  THE  I*J*K  TABLES 

C WHEN  X IS  ORDINAL  AND  Y IS  ORDINAL. 

C W/O  ASSUMING  NO-THREE  FACTOR  INTERACTION  MODEL.

C MAX  NO.  OF  STRATUM:  10 

C MAX  NO.  OF  ROW*COL  : 250 

C COMMON  IS  USED  FOR  NIK,NJK,NTOT 

C COMMON  IS  USED  FOR  WTR,WTC 

IMPLICIT  REAL*8  (A-H,O-Z)

DIMENSION  MATRIX(20,50,50),NIK(50,20),NJK(50,20),NTOT(20)

REAL*8  WTR(50),WTC(50)

REAL*8  UV(10),UK(10),VK(10),GK(10,250),D(10,2500),DT(2500,10)
REAL*8  P(2500),DP(2500,2500),PP(2500,2500),SIGMA(2500,2500)

REAL*8  DSIGMA(10,2500),COVG(10,10),DIV(10)

LOGICAL  KIM 

COMMON  /A1/  NIK,NJK,NTOT
COMMON  /A4/  WTR,WTC 
COMMON  /A5/  P, DP, PP, SIGMA 


NNN=1 

NRNC=NROW*NCOL 

KNRNC=NSTM*NRNC 


NTOTAL=0
DO  100  K=1,NSTM
100  NTOTAL=NTOTAL+NTOT(K)
c print*,'ntotal=',ntotal

DO  200  K=1,NSTM 
UV(K)=0.D0
DO  210  I=1,NROW
DO  220  J=1,NCOL

UV(K)=UV(K)+WTR(I)*WTC(J)*(MATRIX(K,I,J)-
1 DBLE(NIK(I,K)*NJK(J,K))/DBLE(NTOT(K))) 
220  CONTINUE 
210  CONTINUE 

UV(K)=UV(K)/DBLE(NTOTAL)

200  CONTINUE 

IF  (KIM)  GO  TO  900 
C 

C SET  KIM  FOR  SUBSEQUENT  CALLS 

C 

KIM=.TRUE. 


C NULL  ASYMPTOTIC  COVARIANCE  OF  SCORES. 

C COMPUTE  GK(NSTM,NRNC) 

DO  250  K=1,NSTM 
UK(K)=0.D0
VK(K)=0.D0
DO  260  I=1,NROW

260  UK(K)=UK(K)+WTR(I)*NIK(I,K) 

DO  270  J=1,NCOL

270  VK(K)=VK(K)+WTC(J)*NJK(J,K) 

IJ=0 

DO  280  I=1,NROW
DO  290  J=1,NCOL
IJ=IJ+1 

GK(K,IJ)=WTR(I)*WTC(J)-(WTR(I)*VK(K) 

1 +WTC(J)*UK(K))/DBLE(NTOT(K)) 

2 +UK(K)*VK(K)/DBLE(NTOT(K)*NTOT(K))
290  CONTINUE 

280  CONTINUE 
250  CONTINUE 

C COMPUTE  D(NSTM,KNRNC) 

DO  300  K=1,NSTM 


DO  305  IJ=1,KNRNC 
305  D(K,IJ)=0.d0 

300  CONTINUE 

c print*, 'd(k,l) '

DO  310  K=1,NSTM 
DO  320  IJ=1,NRNC 
L=(K-1)*NRNC+IJ 
D(K,L)=GK(K,IJ) 

320  CONTINUE 
310  CONTINUE 

C COMPUTE  SIGMA(KNRNC,KNRNC)=DIAG(P)-PP'

L=0 

DO  400  K=1,NSTM 
DO  410  I=1,NROW
DO  420  J=1,NCOL
L=L+1

420  P(L)=DBLE(NIK(I,K)*NJK(J,K))/DBLE((NTOT(K)*NTOTAL))

410  CONTINUE 
400  CONTINUE 

C P(KNRNC),  DP(KNRNC,KNRNC)

CALL  DIAG1(P, KNRNC, DP) 

CALL  CMULR(P, KNRNC, NNN, KNRNC, PP) 

DO  500  I=1,KNRNC
DO  510  J=1,KNRNC
510  SIGMA(I,J)=DP(I,J)-PP(I,J) 

500  CONTINUE 

C COMPUTE  COV(G(P))=D  SIGMA  D'/NTOTAL

C TRANSPOSE  OF  D(NSTM, KNRNC)  : DT (KNRNC ,NSTM) 

DO  550  I=1,NSTM 
DO  560  J=1,KNRNC
560  DT(J,I)=D(I,J) 

550  CONTINUE 

C COMPUTE  D(NSTM,KNRNC)*SIGMA(KNRNC,KNRNC)=DSIGMA(NSTM, KNRNC) 
c print* ,' dsigma(nstm,knrnc) ' 

DO  600  I=1,NSTM 
DO  610  J=1,KNRNC
DSIGMA(I,J)=0.D0
DO  620  K=1,KNRNC
DSIGMA(I,J)=DSIGMA(I,J)+D(I,K)*SIGMA(K,J)
c print*, i,j ,D(I,K) ,SIGMA(K,J) ,D(I,K)*SIGMA(K, J) ,DSIGMA(I, J) 

620  CONTINUE 

if  (dabs(dsigma(i,j))  .lt.  1.0d-15)  dsigma(i,j)=0.d0
610  CONTINUE 

600  CONTINUE 

C COMPUTE  DSIGMA(NSTM,KNRNC)*DT(KNRNC,NSTM)=COVG(NSTM,NSTM) 
c print*, 'covg(nstm,nstm) ' 

DO  650  I=1,NSTM 
DO  660  J=1,NSTM 
COVG(I,J)=0.D0
DO  670  K=1,KNRNC

670  COVG(I,J)=COVG(I,J)+DSIGMA(I,K)*DT(K,J)

660  CONTINUE 
650  CONTINUE 

C COMPUTE  ESTIMATE  COV  G(P) 

DO  700  I=1,NSTM 
DO  710  J=1,NSTM 

710  COVG(I,J)=COVG(I,J)/DBLE(NTOTAL)

700  CONTINUE 

WRITE(*,720) 

720  FORMAT (/, 'PRINT  NULL  COVARIANCE  MATRIX  ? (Y=1,N=0)') 

READ(*,*)  NCOV 

IF  (NCOV  .NE.  1)  GO  TO  760 

print* 

PRINT*, 'NULL  COVARIANCE  MATRIX  :' 

DO  750  I=1,NSTM 

PRINT*, (SNGL(COVG(I,J)),J=1,NSTM)

750  CONTINUE 


760  JOB=01

n=NSTM 
lda=10 

CALL  dpofa(COVG,lda,n,info)

IF  (INFO  .NE.  0)  THEN 
WRITE(*,699)  INFO 

699  FORMAT(/, 'THE  FACTORIZATION  IS  NOT  COMPLETE.',/, 

1 'THE  LEADING  MINOR  OF  ORDER ',I5,'  IS  NOT  POSITIVE  DEFINITE.')
PRINT*

ENDIF 


CALL  dpodi(COVG,lda,n,det,job)


DO  800  I=2,n
DO  810  J=1,I-1
810  COVG(I,J)=COVG(J,I)

800  CONTINUE 

WRITE(*,850) 

850  FORMAT (/, 'PRINT  INVERSE  MATRIX  OF  NULL  COV.  MATRIX  ? (Y=1,N=0)') 
READ(*,*)  NINVC 
IF  (NINVC  .NE.  1)  GO  TO  900 

PRINT* 

PRINT* , ' INVERSE  MATRIX  : ' 
do  860  i=1,n

860  print*, (SNGL(COVG(i,j)),j=1,n)

C COMPUTE  SCORE  TEST  STATISTIC  : UV'  COVG^-1  UV

900  CALL  MULTVA1(UV,COVG,NSTM,NSTM,DIV)

CALL  INNER1(DIV,UV,NSTM,CMHV) 

CMH=CMHV 

C PRINT*, 'SCORE  STATISTIC  FOR  RANDOM  TABLE  =' ,CMH 

RETURN 

END 